Everyday Tips and Tricks You Need to Know!
We’re almost halfway into 2024, and the tech industry is moving faster than ever due to the rise of Generative AI and Large Language Models.
Bonus: Read about the trending technology behind Generative AI here! 👈🏻
As a data enthusiast and Python developer, you should know these cool tips and tricks to stay relevant in today’s competitive market, where lakhs of tech workers have been laid off.
The first thing to note here is:
Why Python?
Python is an extremely powerful general-purpose programming language that is highly suitable for data science tasks, thanks to its easy-to-understand syntax and its extensive libraries and frameworks like scikit-learn, TensorFlow, PyTorch, etc.
This means having Python under your belt gives you the flexibility to build a wide variety of applications, from automation scripts to chatbots that talk to open-source LLMs. It’s an especially good choice if you’re a beginner, as it’s quite straightforward and quick to learn compared to other programming languages.
So, that being said, let’s get into the code and explore:
How to write better Python codes?
I’ll be discussing 10 tips that can surely transform and elevate your coding practices.
If you find these tips helpful, don’t forget to clap 👏🏻 and comment 🖋️ your thoughts!
1. Using “Generators” to Save Memory
Let’s say you need to analyze a `large_file` that cannot fit into your RAM. To tackle this problem efficiently, you can create a `process_large_file` generator function to read the large file line by line.
Generators not only help in processing large data in batches, but they’re also memory-efficient, since you don’t need to store the whole dataset in memory.
def process_large_file(file_path):
    """
    Generator function to read a large file line by line.
    """
    with open(file_path, 'r') as file:
        for line in file:
            # Yield each line instead of loading the whole file into memory
            yield line

# Path to the large log file
log_file_path = 'path/to/your/large_file.txt'

# Process the large file line by line
for line in process_large_file(log_file_path):
    print(line)
The above code prints the contents of your `large_file.txt` file, one line at a time.
Even the Keras framework uses generators, leveraging multiprocessing to load and process data in batches in parallel and reduce training time.
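Here’s a minimal sketch of that batching idea (the `read_in_batches` name and the `batch_size` parameter are illustrative, not taken from Keras or the code above):
def read_in_batches(file_path: str, batch_size: int = 1000):
    """Yield lists of lines, batch_size lines at a time."""
    batch: list[str] = []
    with open(file_path, 'r') as file:
        for line in file:
            batch.append(line)
            if len(batch) == batch_size:
                yield batch
                batch = []
    if batch:  # yield any leftover lines
        yield batch

# Each iteration hands you one batch, without loading the whole file
for batch in read_in_batches('path/to/your/large_file.txt'):
    print(len(batch))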
2. Using “.setdefault()” in Dictionaries
Suppose you’re managing an inventory system and want to keep track of the stock levels of various products. When a new product is added to the system, you want to ensure it gets a default stock level if one hasn’t been set yet.
You can use the `setdefault()` method to streamline this process: it inserts a key with a specified default value if the key is not already present in the dictionary.
# Initial inventory
inventory: dict[str, int] = {"jeans": 500, "top": 600}
# Add more products with default stock levels if they are not already present
products_to_add: list[str] = ["skirt", "shirt", "tee"]
for product in products_to_add:
    inventory.setdefault(product, 500)
# Print the final updated inventory
print("Final updated inventory:", inventory)
"""
# Output:
Final updated inventory: {'jeans': 500, 'top': 600, 'skirt': 500, 'shirt': 500, 'tee': 500}
"""
This way you avoid the need for explicit checks and assignments, making your code more concise and readable.
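For comparison, here’s roughly the same logic written with explicit membership checks, which is the boilerplate `setdefault()` saves you from writing (the inventory data from above is repeated so the snippet is self-contained):
# Equivalent logic with an explicit check before each assignment
inventory: dict[str, int] = {"jeans": 500, "top": 600}
products_to_add: list[str] = ["skirt", "shirt", "tee"]

for product in products_to_add:
    if product not in inventory:
        inventory[product] = 500

print("Final updated inventory:", inventory)  # same result as the setdefault() version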
3. Using Dictionaries to avoid “if-elif” hell
Suppose you have several functions that you want to call depending on the user’s input.
The most common way of solving this problem is to use `if-elif` conditions, but that approach becomes very long and complex when you’re dealing with hundreds of functions.
The alternative approach is to create a dictionary that maps each key you want to check against to the function to be run as its value.
from collections.abc import Callable

# Function 1
def first():
    print("Calling First Function...")

# Function 2
def second():
    print("Calling Second Function...")

# Function 3
def third():
    print("Calling Third Function...")

# Function Default
def default():
    print("Calling Default Function...")

# User input
options: int = int(input("Enter an option :"))

# Dictionary to hold options as keys and functions as values
funcs_dict: dict[int, Callable[[], None]] = {1: first, 2: second, 3: third}

# Look up the key; if it doesn't exist, the default function will run
final_result = funcs_dict.get(options, default)
final_result()
When you run the program, you can see the following results on your screen.
"""
# Output:
# When the option entered was 0
Enter an option :0
Calling Default Function...
# When the option entered was 1
Enter an option :1
Calling First Function...
# and so on...
"""
Note: If the user enters any other number, the `default()` function will run.
4. Using “Counter” from Collections Module
Trust me on this: when working with large text data, one of the most common tasks in text analysis is identifying key terms, for which you’ll need to determine the frequency of each word in a particular document or the whole corpus, depending on the problem statement.
`Counter` provides a simple and efficient way to count elements in an iterable, abstracting away the complexity of writing custom counting logic.
Let’s implement this:
from collections import Counter
import re

# Read the text file
with open("sample_text.txt", "r") as file:
    text = file.read()

# Clean and tokenize the text
cleaned_words: list[str] = re.sub(r'[^\w\s]', '', text.lower()).split()

# Use Counter() to count the words
word_counts: Counter = Counter(cleaned_words)

# Printing the second most common word
most_common = word_counts.most_common(2)  # the number passed in denotes how many of the most common words we want
print("Second Most Common Word is: ", most_common[1])  # index 1 holds the second most common word
"""
# Output:
Second Most Common Word is: ('data', 82)
"""
Note: Additionally, you can perform arithmetic operations on `Counter` objects, easily convert them to other data structures like dictionaries, and use their handy methods like `elements()`, `most_common()`, etc.
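Here’s a quick illustrative sketch of those extras (the words and counts below are made up for the example):
from collections import Counter

a = Counter({"data": 3, "python": 2})
b = Counter({"data": 1, "ai": 5})

print(a + b)               # arithmetic: Counter({'ai': 5, 'data': 4, 'python': 2})
print(dict(a))             # conversion: {'data': 3, 'python': 2}
print(list(a.elements()))  # elements(): ['data', 'data', 'data', 'python', 'python']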
5. Using “Memoization” to Optimize Code
Memoization is a technique used in dynamic programming to improve the time complexity of recursive algorithms by reusing the results of expensive function calls when the same inputs occur again.
The classic example of this is the Rabbit Problem, popularly known as the Fibonacci Series.
import time
def memo_fibonacci(num: int, dictionary: dict[int, int]) -> int:
    if num in dictionary:
        return dictionary[num]
    else:
        dictionary[num] = memo_fibonacci(num - 1, dictionary) + memo_fibonacci(num - 2, dictionary)
        return dictionary[num]

# Caching using a dictionary (base cases pre-filled)
dictionary: dict[int, int] = {0: 1, 1: 1}
# Elapsed time
start_time: float = time.time()
# Calling the function
result: int = memo_fibonacci(48, dictionary)
end_time: float = time.time()
# Calculating the elapsed time
elapsed_time: float = end_time - start_time
print(f"Result: {result}") # Result: 7778742049
print(f"Elapsed time: {elapsed_time:.6f} seconds") # Elapsed time: 0.000133 seconds
Note: This certainly reduces the time complexity significantly. But remember that it comes with a space-time tradeoff, as you maintain a cache to store the results, which you need to take care of.
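As a side note (not shown in the snippet above), the standard library can also manage this cache for you: the `functools.lru_cache` decorator memoizes a function’s results automatically. A minimal sketch using the same base cases as the dictionary version:
from functools import lru_cache

@lru_cache(maxsize=None)  # cache the result for every distinct argument
def fib(num: int) -> int:
    if num < 2:
        return 1  # base cases matching the dictionary version above
    return fib(num - 1) + fib(num - 2)

print(fib(48))  # 7778742049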
6. Using “@decorators” to avoid Repetitiveness
Let’s say you’re building a Python project and you want to time how long a function takes to run. Sure, as above, you can use the `time` module for that function, but what if you have tens or maybe hundreds of functions?
It would take forever to write ‘start-time’ and ‘end-time’ everywhere. Instead, we can create a function `elapsed_time` to do it for us; we’ll only have to add `@elapsed_time` above any function we want to time.
What are `@decorators`?
Decorators are a distinctive Python feature that wraps around your existing function, allowing you to modify or enhance it before or after it executes, without changing its core logic.
Python sees the `@` symbol and understands that the function beneath it needs to be passed into a function called `elapsed_time`. The function then runs inside `elapsed_time`, with the extra timing code wrapped around it, letting you time any number of functions.
import time

def elapsed_time(func):
    def wrapper():
        start_time: float = time.time()
        func()
        elapsed: float = time.time() - start_time
        print(f"{func.__name__}() took {elapsed:.6f} seconds")
    return wrapper

@elapsed_time
def some_code():
    # Simulating running code...
    time.sleep(0.00002)

# Calling the function
some_code()  # some_code() took 0.000009 seconds
They’re widely used for logging, timing, enforcing access control and more.
Note: It’s recommended not to overdo it, though, because decorators can also obscure what your code is actually doing.
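One more caveat: the `wrapper()` above only works for functions that take no arguments, and it discards return values. A slightly more general sketch (my own variation, not from the original code) forwards arguments and preserves the wrapped function’s metadata with `functools.wraps`:
import functools
import time

def elapsed_time(func):
    @functools.wraps(func)  # keep the original function's name and docstring
    def wrapper(*args, **kwargs):
        start_time: float = time.time()
        result = func(*args, **kwargs)  # pass arguments through and keep the return value
        print(f"{func.__name__}() took {time.time() - start_time:.6f} seconds")
        return result
    return wrapper

@elapsed_time
def add(a: int, b: int) -> int:
    return a + b

print(add(2, 3))  # prints the timing line, then 5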
7. Using `dataclass` for Clean Data Structures
It’s really tedious and error-prone to write the `__init__` method again and again in regular classes that are designed only to hold data values.
Introduced in Python 3.7, the `dataclasses` module provides a more efficient way to store data that will be passed between different parts of a program.
Notice that in just a few lines we can create less error-prone data classes, without having to write a constructor and several other already-implemented methods manually.
from dataclasses import dataclass
@dataclass
class Employee:
    id_: int
    name: str
    salary: float

e1 = Employee(id_=1, name='Tusk', salary=69999.99)
print(e1)  # Employee(id_=1, name='Tusk', salary=69999.99)
Here, the output is equivalent to what you’d get from a standard Python class with a hand-written `__repr__`.
Note: We can also customize the representation of our Employee class:
from dataclasses import dataclass
@dataclass
class Employee:
    id_: int
    name: str
    salary: float

    def __repr__(self):
        return f"Employee(id_={self.id_}, name={self.name}, salary={self.salary})"

    def __str__(self):
        return f"{self.name} earns ${self.salary}."

e1 = Employee(id_=1, name='Tusk', salary=69999.99)
print(repr(e1))  # Employee(id_=1, name=Tusk, salary=69999.99)
print(str(e1))   # Tusk earns $69999.99.
Note:
- The `__repr__` method provides an unambiguous representation of the `Employee` object.
- The `__str__` method provides a more readable and concise description of the `Employee` object.
Start using `dataclass` if you aren’t already; it reduces boilerplate and makes your code much more readable and maintainable.
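For a sense of what you’re saving, here’s roughly the plain-class equivalent of what `@dataclass` generates for you behind the scenes (a sketch of the `__init__`, `__repr__`, and `__eq__` you’d otherwise write by hand):
class Employee:
    def __init__(self, id_: int, name: str, salary: float):
        self.id_ = id_
        self.name = name
        self.salary = salary

    def __repr__(self):
        return f"Employee(id_={self.id_!r}, name={self.name!r}, salary={self.salary!r})"

    def __eq__(self, other):
        if other.__class__ is not self.__class__:
            return NotImplemented
        return (self.id_, self.name, self.salary) == (other.id_, other.name, other.salary)

e1 = Employee(id_=1, name='Tusk', salary=69999.99)
print(e1)  # Employee(id_=1, name='Tusk', salary=69999.99)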
8. Using “match” for Clean Input Handling
Fresh as of Python 3.10, structural pattern matching has been added in the form of a `match` statement with associated `case` blocks.
Let’s say we have a class `Point` that represents a point in the 2D coordinate system. We’ll create a function `where_is` to handle the different inputs a user might pass in to locate the point in the 2D plane.
A `match` statement takes an expression and compares its value against the successive patterns given in one or more `case` blocks.
from dataclasses import dataclass
# Defining a class using dataclass
@dataclass
class Point:
    x: int
    y: int

# Match statement to handle the different cases
def where_is(point):
    match point:
        case Point(x=0, y=0):
            return "Origin"
        case Point(x=0, y=y):
            return f"Y={y}"
        case Point(x=x, y=0):
            return f"X={x}"
        case Point(x, y):
            return "Somewhere else"
        # To catch anything else that the user passes in
        case _:
            return "Not a point"

# Examples
print(where_is(Point(0, 0)))    # Output: Origin
print(where_is(Point(0, 10)))   # Output: Y=10
print(where_is(Point(10, 0)))   # Output: X=10
print(where_is(Point(10, 10)))  # Output: Somewhere else
print(where_is("Not a point"))  # Output: Not a point
Using `match-case` statements, you can handle all the possible cases, ensuring exhaustive pattern matching.
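Pattern matching also works on plain literals, so a simple command-style input handler (my own illustrative example, not from the original) could look like this:
# Matching literal strings, with multiple literals combined in one case
command: str = input("Enter a command (start/stop/status): ").strip().lower()

match command:
    case "start":
        print("Starting the service...")
    case "stop":
        print("Stopping the service...")
    case "status" | "info":
        print("Service is running.")
    case _:
        print(f"Unknown command: {command}")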
9(A). Using the “all” Operator instead of “and”
Imagine you’re building a user profile system and want to validate that all required fields in a form are filled out by the user (I don’t know why you wouldn’t just mark them *required in the form instead 🤷🏻‍♀️, but let’s concentrate here 👇).
Well, instead of chaining `and` conditions, you can use the `all()` function, which returns `True` if and only if all elements in the provided iterable are truthy.
# User input from a registration form
form_data: dict[str, str] = {"name": "Nikita",
                             "email": "analyticalnikita@gmail.com",
                             "phone": "123478911"}

# List of required fields
required_fields: list[str] = ["name", "email", "phone"]

# Using the all() function
if all(field in form_data for field in required_fields):
    print("All required fields are filled out.")
else:
    print("Some required fields are missing or empty.")
"""
# Output:
All required fields are filled out.
"""
9(B). Using the “any” Operator instead of “or”
The `any()` function returns `True` if any element in the provided iterable is truthy.
For instance, say you have to restrict certain permissions to certain users based on some criteria. For that, you can use `any()` instead of chained `or` conditions.
# List of permissions the user has
user_permission: list[str] = ["read", "execute"]

# Check if user has at least one of the required permissions
required_permissions: list[str] = ["write", "read", "admin"]

# Using the any() function
if any(permission in user_permission for permission in required_permissions):
    print("You have the required permissions. Access is allowed.")
else:
    print("You're a standard user. Access is not allowed.")
"""
# Output:
You have the required permissions. Access is allowed.
"""
These examples show how `any` and `all` can be used to simplify conditions that would otherwise require multiple `or` or `and` statements, respectively.
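For contrast, here’s what those same checks look like written out with chained `and` / `or`, hard-coding every field and permission:
# Repeating the data from above so the snippet is self-contained
form_data: dict[str, str] = {"name": "Nikita", "email": "analyticalnikita@gmail.com", "phone": "123478911"}
user_permission: list[str] = ["read", "execute"]

# The all() example written out with chained "and"
if "name" in form_data and "email" in form_data and "phone" in form_data:
    print("All required fields are filled out.")

# The any() example written out with chained "or"
if "write" in user_permission or "read" in user_permission or "admin" in user_permission:
    print("Access is allowed.")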
Last but not least, every programmer’s all-time favorite (hey, if not yours, then stick with me to learn it 😉).
10. Using Comprehensions for Shorter Syntax
Comprehensions are a powerful toolkit Python provides for all of its iterable data types. They give you a concise way to replace multi-line loops with a single line, depending on the situation.
Let’s explore them one-by-one:
10(A). List Comprehensions
Here, I’m using a nested-if example to show you the power of list comprehensions:
# Nested-if using List Comprehension
fruits: list[str] = ["apple", "orange", "avocado", "kiwi", "banana"]
basket: list[str] = ["apple", "avocado", "apricot", "kiwi"]

[i for i in fruits if i in basket if i.startswith("a")]  # ['apple', 'avocado']
Similarly, you can also use nested for-loops, as long as they don’t make your comprehension hard to read.
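For example, here’s a small nested-for comprehension (my own example) that flattens a list of lists while keeping the filtering:
# Nested for-loops in a list comprehension: flatten a list of lists
baskets: list[list[str]] = [["apple", "kiwi"], ["banana"], ["orange", "apricot"]]

[fruit for basket in baskets for fruit in basket if fruit.startswith("a")]
# ['apple', 'apricot']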
10(B). Tuple Comprehensions
Tuple comprehensions as such don’t exist in Python. Instead, you can pass a generator expression to the `tuple()` constructor to create a tuple.
# Generator expression converted to a tuple
tuple(i**2 for i in range(10))
# (0, 1, 4, 9, 16, 25, 36, 49, 64, 81)
10(C). Dictionary Comprehensions
Let’s say you have a list `apple_names` and you want to print a new collection that contains the length of each element of `apple_names`.
Sure, you could use a list comprehension here. But did you know you can actually use this notation to create a dictionary? It’s called, well yes, a dictionary comprehension.
# Creating a list of apple names
apple_names: list[str] = ["apple", "pineapple", "green apple"]
# Creating a dictionary with apple names as keys and their lengths as values
print({apple_name: len(apple_name) for apple_name in apple_names})
# {"apple": 5, "pineapple": 9, "green apple": 11}
It’s also more readable than using loops or the `dict` constructor to create a dictionary.
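For comparison, here’s the same dictionary built with a plain loop, which takes several lines to say the same thing:
# Same result with an explicit loop
apple_names: list[str] = ["apple", "pineapple", "green apple"]

lengths: dict[str, int] = {}
for apple_name in apple_names:
    lengths[apple_name] = len(apple_name)

print(lengths)  # {'apple': 5, 'pineapple': 9, 'green apple': 11}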
10(D). Set Comprehensions
You can also filter elements based on certain conditions inside comprehensions.
# Creating a set with condition
print({i**2 for i in range(1, 11) if i > 5})
# {64, 36, 100, 49, 81}
Note: While comprehensions are expressive, that doesn’t mean they’re suitable for every case, especially those involving overly complex logic.