Why Python is Irreplaceable for Data Science, AI, and Automation - A Design Perspective

Python is currently the most popular programming language for machine learning and data science. In this post, I will share my subjective experience on why I've stuck with Python for the last 3 years, and why I think Python's design is the future (for general purposes, not only AI, Data Science, and Automation).

I have 7 years of coding experience, jumping from one use case to another while trying to find my passion when I was younger. I have production-level experience with Java, C++, Flutter, TypeScript, and Python, and Python (along with TypeScript, to be fair) is the one I’ve stayed with. It might still get beaten one day by an “English programming language” powered by LLMs in the next decade, but for now, Python is on another level.

Language Extensions and Bindings

This is the principle that I think will be the future. All of Python's benefits (described in the following sections) come at the cost of significantly slower execution, and the solution to that cost is really smart. Python has extensive binding capabilities that let Python users call C/C++ libraries almost seamlessly. This is why NumPy, which does its heavy computation in C, feels native to Python users.

# ctypes for calling C libraries
import ctypes

# Load system library
libc = ctypes.CDLL("libc.so.6")  # Linux
# libc = ctypes.CDLL("msvcrt.dll")  # Windows

# Call C function
libc.printf(b"Hello from C: %d\n", 42)

This feels like a free lunch: a high-level language that is easy to understand but also performant. This is why Python dominates in scientific computing, AI, and data science. Libraries like TensorFlow, PyTorch, and scikit-learn perform complex mathematical computation, but they are actually implemented on top of optimized C/C++ code; under the hood, they are calling compiled code! One of the torch developers once said that a good way to design a performant Python library is to write less Python code.
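To make the "write less Python code" advice concrete without installing anything, here is a minimal sketch that uses only the standard library. It assumes CPython, where the built-in sum is implemented in C. The function names python_loop_sum and builtin_sum are made up for this illustration, and the measured timings will differ from machine to machine.

```python
import timeit

def python_loop_sum(values):
    """Add the numbers one by one in interpreted Python code."""
    total = 0
    for value in values:
        total += value
    return total

def builtin_sum(values):
    """Hand the whole loop to the built-in sum (C code in CPython)."""
    return sum(values)

data = list(range(100_000))

# Both approaches give the same answer...
assert python_loop_sum(data) == builtin_sum(data)

# ...but the version that executes less Python code is usually faster.
loop_time = timeit.timeit(lambda: python_loop_sum(data), number=20)
builtin_time = timeit.timeit(lambda: builtin_sum(data), number=20)
print(f"pure Python loop: {loop_time:.4f}s, built-in sum: {builtin_time:.4f}s")
```

NumPy and PyTorch apply the same idea at a much larger scale: the Python layer describes what to compute, and the loop itself runs in native code.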

Python puts the human in control more than any other programming language I have used. With this capability, you can turn your ideas into code that is both flexible and performant, something I have not found in other languages, even other dynamic ones.

Dynamic Typing, Introspection, and Monkey Patching

Python’s dynamic typing (duck typing) lets me focus on what an object can do rather than what it is. I don’t need to write heavy type declarations - I just pass in any object with a read() method and move on. Tools like dir(), hasattr(), and isinstance() let me inspect and play around at runtime, which feels super freeing when prototyping.
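Here is a small sketch of the read() idea described above. UppercaseSource and count_words are hypothetical names invented for this example; the only assumption is that the object passed in has a read() method that returns text.

```python
import io

class UppercaseSource:
    """Hypothetical class: not a file, but it has a read() method."""
    def __init__(self, text):
        self._text = text

    def read(self):
        return self._text.upper()

def count_words(source):
    # No type check: anything with a read() method that returns text works.
    return len(source.read().split())

print(count_words(io.StringIO("duck typing in action")))  # 4
print(count_words(UppercaseSource("quack quack")))         # 2

# Introspection at runtime: check for the capability, not the type.
print(hasattr(UppercaseSource("x"), "read"))               # True
```

The function never asks what the object is, only whether it can do what is needed.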

But here’s where Python shines: duck typing enables polymorphism based on behavior rather than explicit inheritance. “If it walks like a duck and quacks like a duck, it must be a duck.” This is fundamentally different from Java’s rigid type system where everything must be declared upfront.

class DataProcessor:
    def process_csv(self, data): return "CSV processed"
    def process_json(self, data): return "JSON processed"
    def process_xml(self, data): return "XML processed"

processor = DataProcessor()
data_type = "json"
method_name = f"process_{data_type}"

# Dynamic method calling
if hasattr(processor, method_name):
    method = getattr(processor, method_name)
    result = method("sample_data")
    print(result)  # JSON processed

This is actually bad design for production code, where you should use static type hints and Protocols instead.
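One way the production-friendly version could look is sketched below, using typing.Protocol from the standard library. The names Processor, CsvProcessor, JsonProcessor, PROCESSORS, and run are assumptions made for this illustration, not part of any library, and the sketch assumes Python 3.9 or newer for the dict[...] annotation.

```python
from typing import Protocol

class Processor(Protocol):
    """Structural interface: any class with a matching process() method fits."""
    def process(self, data: str) -> str: ...

class CsvProcessor:
    def process(self, data: str) -> str:
        return "CSV processed"

class JsonProcessor:
    def process(self, data: str) -> str:
        return "JSON processed"

# An explicit registry replaces the f-string method lookup.
PROCESSORS: dict[str, Processor] = {
    "csv": CsvProcessor(),
    "json": JsonProcessor(),
}

def run(data_type: str, data: str) -> str:
    try:
        processor = PROCESSORS[data_type]
    except KeyError:
        raise ValueError(f"Unsupported data type: {data_type}") from None
    return processor.process(data)

print(run("json", "sample_data"))  # JSON processed
```

This is still duck typing, since neither processor inherits from Processor, but a static type checker can now verify that every registered object really has a process() method.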

Why don’t we use inheritance? Excessive (multi-level) inheritance is bad design because of the high cognitive load of working out which functions are inherited at a particular level and which are not. Most of the popular open source Python libraries I’ve worked with use at most 2 levels (an abstract class and a concrete class inheriting from it).

Now the important part: monkey patching. This is a crucial capability that most statically typed languages do not offer. There was a case where I needed to modify an open source library to fit my use case. Once I understood how the library worked, I could monkey patch it.

import torch
from torch import nn

# A toy stand-in for a real model such as GPT2LMHeadModel.
# Let's say the output is a step-by-step function composition:
# model(x) = model[2]( model[1]( model[0](x) ) )
# Let's say I want to experiment with replacing the second layer with a normalization.
model = nn.Sequential(nn.Linear(4, 4), nn.Linear(4, 4), nn.Linear(4, 4))

def custom_normalization_layer(previous_layer_output):
    mean = previous_layer_output.mean()
    std = previous_layer_output.std()
    return (previous_layer_output - mean) / std

# Monkey patch: override forward() on the live second layer, without editing the library source
model[1].forward = custom_normalization_layer

print(model(torch.randn(2, 4)))  # The second layer now performs a normalization
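Because a monkey patch changes behavior globally, it helps to limit how long it stays active. The sketch below uses unittest.mock.patch.object from the standard library, which restores the original attribute when the with-block exits. Greeter is a hypothetical stand-in for a class from a third-party library, and excited_greet is a made-up replacement.

```python
from unittest import mock

class Greeter:
    """Hypothetical stand-in for a class from a third-party library."""
    def greet(self, name):
        return f"Hello, {name}"

def excited_greet(self, name):
    """Made-up replacement method used only for this demo."""
    return f"HELLO, {name.upper()}!"

greeter = Greeter()
before = greeter.greet("Ada")

# The patch is active only inside the with-block.
with mock.patch.object(Greeter, "greet", excited_greet):
    during = greeter.greet("Ada")

# The original method is back once the block exits.
after = greeter.greet("Ada")

print(before)  # Hello, Ada
print(during)  # HELLO, ADA!
print(after)   # Hello, Ada
```

Scoping the patch this way keeps an experiment from leaking into the rest of the program.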

Everything is an Object: The Whole Language Described in a Single Abstraction

This is where Python becomes philosophical. Everything - functions, classes, modules, even types themselves - is an object. This single abstraction makes the language easy to understand and master: once you grasp it, you can read and write advanced code much more easily. Coming from Java, where primitives exist separately from objects, this was mind-blowing.

def my_function():
    return "Hello"

# Functions are objects
print(type(my_function))  # <class 'function'>
print(my_function.__name__)  # my_function

# You can add attributes to functions
my_function.description = "A greeting function"
print(my_function.description)  # A greeting function

# Classes are objects too
class MyClass:
    pass

print(type(MyClass))  # <class 'type'>
instance = MyClass()
print(type(instance))  # <class '__main__.MyClass'>
instance.name = "My Instance" # injecting attribute to instance
print(instance.name)  # My Instance

# Modules are objects too
import math
print(type(math))  # <class 'module'>
print(math.__name__)  # math

Every Object's State Is Automatically Managed in a Single Attribute

The __dict__ attribute essentially defines the object: by default, an instance's attributes live there. This dynamic nature allows classes and instances to be modified at runtime. Amazingly, you can also customize this behavior. For example, PyTorch customizes attribute assignment so that adding a new neural network layer or module automatically registers it in the model's state, which makes saving and loading models easier. You can also modify an object's internal attributes easily, which gives you full control over the object.

class FlexibleClass:
    def __init__(self, name):
        self.name = name

obj = FlexibleClass("test")
print(obj.__dict__)  # {'name': 'test'}

# Add attributes dynamically
obj.age = 25
obj.email = "test@example.com"
print(obj.__dict__)  # {'name': 'test', 'age': 25, 'email': 'test@example.com'}

# Even add methods dynamically
def get_info(self):
    return f"{self.name} is {self.age} years old"

obj.get_info = get_info.__get__(obj, FlexibleClass)
print(obj.get_info())  # test is 25 years old
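The automatic registration mentioned for PyTorch can be imitated in a few lines by overriding __setattr__. This is a toy sketch loosely inspired by that idea, not PyTorch's actual implementation; the classes Module, Linear, and Net and the state() method are invented for illustration.

```python
class Module:
    """Toy base class that tracks child modules as they are assigned."""
    def __init__(self):
        # Bypass our own __setattr__ while creating the registry itself.
        object.__setattr__(self, "_modules", {})

    def __setattr__(self, name, value):
        # Register child modules automatically, then store the attribute as usual.
        if isinstance(value, Module):
            self._modules[name] = value
        object.__setattr__(self, name, value)

    def state(self):
        """Collect plain attributes of this module and of all registered children."""
        own = {k: v for k, v in self.__dict__.items()
               if k != "_modules" and not isinstance(v, Module)}
        for name, child in self._modules.items():
            own[name] = child.state()
        return own

class Linear(Module):
    def __init__(self, size):
        super().__init__()
        self.size = size

class Net(Module):
    def __init__(self):
        super().__init__()
        self.encoder = Linear(8)  # registered without any extra call
        self.decoder = Linear(4)

net = Net()
print(list(net._modules))  # ['encoder', 'decoder']
print(net.state())         # {'encoder': {'size': 8}, 'decoder': {'size': 4}}
```

The user of Net just assigns attributes as usual, and the base class quietly keeps the bookkeeping needed to save or load the whole tree.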

Multiple inheritance is supported, but Python’s Method Resolution Order (MRO) makes it actually usable. Unlike C++ where diamond inheritance can be a nightmare, Python’s MRO follows a predictable linearization:

class A:
    def method(self): return "A"

class B(A):
    def method(self): return "B"

class C(A):
    def method(self): return "C"

class D(B, C):
    def method(self): return "D"

print(D.mro())  # [<class '__main__.D'>, <class '__main__.B'>, <class '__main__.C'>, <class '__main__.A'>, <class 'object'>]

d = D()
print(d.method())  # D

Although, as I said, multi-level inheritance is bad design, multiple inheritance opens up a flexible way to apply interface segregation: a class can combine several small, focused base classes.
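As a rough sketch of that interface segregation idea, the example below combines two small mixins, each providing one capability. JsonExportMixin, ComparableMixin, and User are hypothetical names made up for this illustration.

```python
import json

class JsonExportMixin:
    """Small capability: serialize the instance's attributes to JSON."""
    def to_json(self):
        return json.dumps(self.__dict__, sort_keys=True)

class ComparableMixin:
    """Small capability: equality based on the instance's attributes."""
    def __eq__(self, other):
        return type(self) is type(other) and self.__dict__ == other.__dict__

class User(JsonExportMixin, ComparableMixin):
    def __init__(self, name, age):
        self.name = name
        self.age = age

user = User("test", 25)
print(user.to_json())            # {"age": 25, "name": "test"}
print(user == User("test", 25))  # True

# The hierarchy stays shallow: each mixin sits directly below object.
print([cls.__name__ for cls in User.mro()])
# ['User', 'JsonExportMixin', 'ComparableMixin', 'object']
```

Each base class stays small enough to read on its own, and the MRO makes it clear where every method comes from.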

This flexibility enables complex object creation patterns that would be verbose or hard to express in many other languages. Popular libraries like SQLAlchemy use this extensively for their ORM magic.

Conclusion: Human-Centric Design

Here’s what I’ve realized after 7 years: Python trusts developers. Unlike languages that enforce strict rules to “protect” you from mistakes, Python gives you rope and trusts you not to hang yourself. This might sound dangerous, but it’s actually liberating. The only limit is your own skill and creativity!

In Java, you write code to satisfy the compiler. In Python, you write code to express your ideas. The flexibility that makes Python “dangerous” in enterprise Java environments is exactly what makes it powerful for innovation.

The flexibility that makes some developers write spaghetti code is the same flexibility that lets experienced developers create elegant, maintainable systems. The assumption is that developers are smart enough to understand code without excessive type annotations and rigid structures. This leads to more readable, concise code that focuses on business logic rather than language ceremony. Yes, this flexibility can lead to tactical programming when misused. But in the hands of thoughtful developers, it enables strategic thinking and elegant solutions.

Why This Matters for AI, Data Science, and Automation

Rapid Prototyping: Dynamic typing lets you experiment quickly without boilerplate
Interfacing with Native Code: AI libraries need performance-critical C/C++ backends
Human Readability: Data scientists and researchers need code that’s easy to understand and modify. They aren’t programmers who deal only with code.