5 Data Transformation Tricks - Python in Excel:
- V E Meganathan
- 17 hours ago
- 2 min read
Trick - 1:
Extract 2 Characters at Each Position:
t = 'RickRothstein'
[''.join(i) for i in zip(t,t[1:])]
or,
[b+c for b,c in zip(t,t[1:])]

Today, I explored the task of extracting two-characters from each position within a string. Driven by my interest in learning Python, I sought an effective approach to accomplish this.
Consider a sample input string such as 'rothstein'.
Since strings in Python are iterable, I began by applying the zip function to two versions of the text.
The first version contains the complete string,
while the second starts from index 1 (essentially the second character onward) through the remainder of the string.
The zip function generates a nested list structure, where each inner list contains one character from the first string with the corresponding character from the second. By default, zip truncates the output to match the length of the shorter input.
Trick - 2:
Unpack a List or Series:
Approach 1:
Assign each item to a named variable
L = ['B', 'A', 'Jan', 121]
From, To, Mnt, Qty = L
'-'.join(sorted([From, To])), Mnt, Qty
Each variable name on the left matches one item from the list, in order.
Approach 2:
Assign corresponding items and collect the rest
From, To, *a = L
'-'.join(sorted([From, To])), a[0], a[1]
From gets 'B', To gets 'A', and the remaining items ('Jan' and 121) are collected into the variable a.
Trick - 3:
Row wise Aggregation:
Source Table contains Match #, Team1 Goal and Team2 Goal.
Our objective is to calculate Total goals per match.

Method-1:
df = xl("A1:C11", headers = True)
df['Total Goals'] = np.add(df['Team1'].values,df['Team2'].values)
df
Method-2:
df = xl("A1:C11", 1)
df['Total Goals'] = df.apply(lambda x: sum(x[1:]), 1)
df
Method-3:
df = xl("A1:C11", 1)
df['Total Goals'] = [b+c for b,c in zip(df['Team1'], df['Team2'])]
df
Trick - 4:
Unpivot Data:

Our input data is in Cross Tabulated format.
Our Goal is to transform the input table into Proper dataset.
df = xl("A1:D5", headers = True)
df.melt(id_vars=['Product'],var_name='Month',value_name = 'Sales')
Trick - 5:
Row and Column Total:
Source table includes month wise sales.
Our goal is to calculate Row and Column Total.

df = xl("A1:D5", 1)
df['Total'] = df.sum(axis = 1)
df.loc['Total'] = df.sum(axis=0)
df.reset_index(drop=True)





Comments