Concatenation in Python

Introduction to Concatenation in Python

So what is concatenation in Python? In pandas, the main data structure is the data frame (DataFrame). When you want to join multiple data frames according to the values in their columns or rows, you use merging and concatenation. pandas provides the functions pd.merge() and pd.concat() for these operations.

Merging

Merging combines multiple data frames using their common columns as keys. For this we use the pandas.merge function.

There are three different methods of merging

1. On the basis of the Same columns in different data frames.

Here we pass the column name that both data frames share, and the rows are merged according to the matching values in that common column.

Syntax of merge:-

pd.merge(first_dataframe, second_dataframe, how='method', on='column name')
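As a small sketch of the syntax above, the following compares passing on explicitly with leaving it out. The frames left and right and their values are made up for illustration. When on is omitted, pandas merges on the column names that both frames share, so the two calls should give the same result here.

```python
import pandas as pd

# Hypothetical frames for illustration; "key" is the only shared column name.
left = pd.DataFrame({"key": ["x", "y", "z"], "a": [1, 2, 3]})
right = pd.DataFrame({"key": ["y", "z", "q"], "b": [10, 20, 30]})

# Explicit key column.
explicit = pd.merge(left, right, how="inner", on="key")

# With `on` omitted, pandas uses the columns common to both frames.
implicit = pd.merge(left, right, how="inner")

print(explicit)
print(explicit.equals(implicit))
```

Naming the column with on is usually the safer choice, because it keeps the merge from silently changing if another shared column is added later.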



There are four different ways of merging.

  • Left merge
  • Right Merge
  • Inner Merge
  • Outer Merge
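To get a quick feel for how the four methods differ, here is a minimal sketch that runs each of them on the same two data frames used in the examples of this article and counts the rows in the result. The dictionary name row_counts is only an illustrative choice.

```python
import numpy as np
import pandas as pd

dframe_1 = pd.DataFrame({"key": ["x", "z", "y", "z", "x", "x"],
                         "Dataset1": np.arange(6)})
dframe_2 = pd.DataFrame({"key": ["q", "y", "z"], "Dataset2": [1, 2, 3]})

# Run the same merge with each method and record how many rows come back.
row_counts = {}
for method in ["left", "right", "inner", "outer"]:
    merged = pd.merge(dframe_1, dframe_2, on="key", how=method)
    row_counts[method] = len(merged)
    print(method, "->", len(merged), "rows")

# left -> 6 rows, right -> 4 rows, inner -> 3 rows, outer -> 7 rows
```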

Inner Merge

In this merge, only the rows whose key value occurs in both data frames are kept.

Inner join -- courtesy of codinghorror.com

Example:- Create Two Different data frames and merge them according to the same columns.

import numpy as np
import pandas as pd
from pandas import DataFrame

dframe_1 = DataFrame({'key':['x','z','y','z','x','x'],'Dataset1':np.arange(6)})
print('First DataFrame : \n',dframe_1)

dframe_2 = DataFrame({'key':['q','y','z'],'Dataset2':[1,2,3]})
print('\nSecond DataFrame : \n',dframe_2)

merge = pd.merge(dframe_1,dframe_2,on='key',how='inner')
# on selects the key column; how='inner' keeps only keys present in both data frames

print("\nMerging of two data frame first and second according to same column('Key') \n",merge)

Output:-

First DataFrame : 
   key  Dataset1
0   x         0
1   z         1
2   y         2
3   z         3
4   x         4
5   x         5

Second DataFrame : 
   key  Dataset2
0   q         1
1   y         2
2   z         3

Merging of two data frame first and second according to same column('Key') 
   key  Dataset1  Dataset2
0   z         1         3
1   z         3         3
2   y         2         2

The merge output matches rows on the common column and returns only the key values that are present in both data frames (z and y).

Left Merge

A left merge returns every row of the left data frame together with the matching values from the right data frame. Where the right data frame has no match, the value is NaN.

Left Join
Code:-
dframe_1 = DataFrame({'key':['x','z','y','z','x','x'],'Dataset1':np.arange(6)})
print('First DataFrame : \n',dframe_1)

dframe_2 = DataFrame({'key':['q','y','z'],'Dataset2':[1,2,3]})
print('\nSecond DataFrame : \n',dframe_2)

merge = pd.merge(dframe_1,dframe_2,on='key',how='left')
# on selects the key column; how='left' is the left merge

print("\nMerging of two dataframe first and second according to same column('Key') and method is left merge :\n",merge)

Output:-
First DataFrame : 
   key  Dataset1
0   x         0
1   z         1
2   y         2
3   z         3
4   x         4
5   x         5

Second DataFrame : 
   key  Dataset2
0   q         1
1   y         2
2   z         3

Merging of two dataframe first and second according to same column('Key') and method is left merge :
   key  Dataset1  Dataset2
0   x         0       NaN
1   z         1       3.0
2   y         2       2.0
3   z         3       3.0
4   x         4       NaN
5   x         5       NaN

Here the left merge collects all values from the left data frame (first data frame) and only the matching values from the right data frame (second data frame). The value q does not appear in the output because it is not present in the left data frame, and the x rows get NaN because x is not present in the right data frame.



Right merge

A right merge returns every row of the right data frame together with the matching values from the left data frame.

Example:-
dframe_1 = DataFrame({'key':['x','z','y','z','x','x'],'Dataset1':np.arange(6)})
print('First DataFrame : \n',dframe_1)

dframe_2 = DataFrame({'key':['q','y','z'],'Dataset2':[1,2,3]})
print('\nSecond DataFrame : \n',dframe_2)

merge = pd.merge(dframe_1,dframe_2,on='key',how='right')     
# on selects the key column; how='right' is the right merge

print("\nMerging of two data frame first and second according to same column('Key') and method is right merge :\n",merge)

Output:-
First DataFrame : 
   key  Dataset1
0   x         0
1   z         1
2   y         2
3   z         3
4   x         4
5   x         5

Second DataFrame : 
   key  Dataset2
0   q         1
1   y         2
2   z         3

Merging of two data frame first and second according to same column('Key') and method is right merge :
   key  Dataset1  Dataset2
0   z       1.0         3
1   z       3.0         3
2   y       2.0         2
3   q       NaN         1

Here the right merge collects all values from the right data frame (second data frame) and only the matching values from the left data frame (first data frame). The value x does not appear in the output because it is not present in the right data frame, and the q row gets NaN because q is not present in the left data frame.

Outer Merge

An outer merge collects all values from both the left and the right data frame.

Example:-
dframe_1 = DataFrame({'key':['x','z','y','z','x','x'],'Dataset1':np.arange(6)})
print('First DataFrame : \n',dframe_1)

dframe_2 = DataFrame({'key':['q','y','z'],'Dataset2':[1,2,3]})
print('\nSecond DataFrame : \n',dframe_2)

merge = pd.merge(dframe_1,dframe_2,on='key',how='outer')     
# on selects the key column; how='outer' is the outer merge

print("\nMerging of two dataframe first and second according to same column('Key') and method is outer merge :\n"
      ,merge)

Output:-

First DataFrame : 
   key  Dataset1
0   x         0
1   z         1
2   y         2
3   z         3
4   x         4
5   x         5

Second DataFrame : 
   key  Dataset2
0   q         1
1   y         2
2   z         3

Merging of two dataframe first and second according to same column('Key') and method is outer merge :
   key  Dataset1  Dataset2
0   x       0.0       NaN
1   x       4.0       NaN
2   x       5.0       NaN
3   z       1.0       3.0
4   z       3.0       3.0
5   y       2.0       2.0
6   q       NaN       1.0

Here every key value from both data frames is present in the merge output, with NaN wherever one side has no match.
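When it is not obvious which side a row came from, pd.merge accepts an indicator argument. The sketch below reuses the two data frames from the outer merge example; with indicator=True, pandas adds a _merge column marking each row as left_only, right_only or both.

```python
import numpy as np
import pandas as pd

dframe_1 = pd.DataFrame({"key": ["x", "z", "y", "z", "x", "x"],
                         "Dataset1": np.arange(6)})
dframe_2 = pd.DataFrame({"key": ["q", "y", "z"], "Dataset2": [1, 2, 3]})

# indicator=True adds a "_merge" column describing the origin of each row.
merged = pd.merge(dframe_1, dframe_2, on="key", how="outer", indicator=True)
print(merged)

# Count how many rows came from each side.
print(merged["_merge"].value_counts())
```

In this data the three x rows are marked left_only, the q row is marked right_only, and the z and y rows are marked both.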

2.  On the basis of the column value and index of the data frame

In this part, we select a column of one data frame and the index of the other data frame, and merge on the basis of matching values between that column and that index.

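Syntax:- pd.merge(left, right, how='method', left_on='column name', right_on='column name', left_index=True, right_index=True)

For each side, pass either the column name (left_on / right_on) or the index flag (left_index / right_index), not both.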

Following are the parameters:-

left = The first (left) data frame object.

right = The second (right) data frame object.

how = Method of merging (left / inner / right / outer)

left_on = Column name of the left data frame to use as the key

right_on = Column name of the right data frame to use as the key

left_index = Pass True to use the index of the left data frame as the key

right_index = Pass True to use the index of the right data frame as the key
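The examples in this section combine left_on with right_index. As a complement, here is a minimal sketch of left_on together with right_on, for the case where the key column has a different name in each data frame. The frames orders and groups and their column names are hypothetical, chosen only for illustration.

```python
import pandas as pd

# Hypothetical frames: the key column is named differently on each side.
orders = pd.DataFrame({"customer": ["x", "y", "z"], "amount": [5, 7, 9]})
groups = pd.DataFrame({"cust_id": ["x", "z"], "group_data": [10, 20]})

# left_on names the key column of the left frame, right_on that of the right frame.
merged = pd.merge(orders, groups, how="left",
                  left_on="customer", right_on="cust_id")
print(merged)
```

Both key columns are kept in the result, so the row for y shows NaN in cust_id and group_data.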

The following data frames are used to perform the merging.

Code:-
df_left = DataFrame({'key': list('xxyyz'),'data': np.arange(5)})
print('First DataFrame : \n',df_left)

df_right = DataFrame({'group_data':[10,20]}, index = ['x','x'])
print('\nSecond DataFrame : \n',df_right)

Output:-
First DataFrame : 
   key  data
0   x     0
1   x     1
2   y     2
3   y     3
4   z     4

Second DataFrame : 
    group_data
x          10
x          20

Example 1:-  Select left data frame column and right data frame index and method is inner merge.

Note:- Use above given data frame for merging


Code:-
merge = pd.merge(df_left,df_right,left_on='key',right_index=True,how='inner') 
# left_on = For selecting the columns of left dataframe
# right_index= For selecting the index of right dataframe
# how = method of merging

print("\nMerging of two data frame first and second according to values of column and index of data frame method is inner merge : \n\n", merge)

Output:-
Merging of two data frame first and second according to values of column and index of data frame method is inner merge : 

   key  data  group_data
0   x     0          10
0   x     0          20
1   x     1          10
1   x     1          20

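The merge output contains only the x rows because the method is inner and x is the only value common to the key column and the index of the right data frame. Each of the two x rows on the left matches both x entries in the right index, which gives four rows.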
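The number of rows in the output above follows from the duplicate keys: every matching row on the left is paired with every matching entry on the right. A minimal sketch that checks this on the same two data frames (the names left_matches and right_matches are illustrative):

```python
import numpy as np
import pandas as pd

df_left = pd.DataFrame({"key": list("xxyyz"), "data": np.arange(5)})
df_right = pd.DataFrame({"group_data": [10, 20]}, index=["x", "x"])

merged = pd.merge(df_left, df_right, left_on="key", right_index=True, how="inner")

# Count the matching entries on each side for the shared value "x".
left_matches = int((df_left["key"] == "x").sum())    # rows with key "x"
right_matches = int((df_right.index == "x").sum())   # index entries labelled "x"

# Each left match is paired with each right match.
print(len(merged), "rows =", left_matches, "x", right_matches)
```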

Example 2:- Select the left data frame column and right data frame index and the method is the outer merge.

Note:- Use above given data frame for merging

Code:-
merge = pd.merge(df_left,df_right,left_on='key',right_index=True,how='outer') 
# left_on = For selecting the columns of left dataframe
# right_index= For selecting the index of right dataframe
# how = method of merging

print("\nMerging of two data frame first and second according to values of column and index of data frame method is outer merge : \n\n"
      , merge)
Output:-
Merging of two data frame first and second according to values of column and index of data frame method is outer merge : 

   key  data  group_data
0   x     0        10.0
0   x     0        20.0
1   x     1        10.0
1   x     1        20.0
2   y     2         NaN
3   y     3         NaN
4   z     4         NaN

Here we apply the outer merge, so the output shows all values from both the key column and the index.

Example 3:- Select the left data frame column and right data frame index and the method is the left merge.

Note:- Use above given data frame for merging

Code:-
merge = pd.merge(df_left,df_right,left_on='key',right_index=True,how='left') 
# left_on = For selecting the columns of left dataframe
# right_index= For selecting the index of right dataframe
# how = method of merging

print("\nMerging of two data frame first and second according to values of column and index of data frame method is left merge : \n\n", merge)

Output:-
Merging of two data frame first and second according to values of column and index of data frame method is left merge : 

   key  data  group_data
0   x     0        10.0
0   x     0        20.0
1   x     1        10.0
1   x     1        20.0
2   y     2         NaN
3   y     3         NaN
4   z     4         NaN

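Here the left method keeps every value of the left column and adds the matching values from the right data frame index. Because the only index value on the right (x) also occurs in the left column, the result is the same as the outer merge.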

Example 4:- Select the left data frame column and right data frame index and the method is the right merge.

Note:- Use above given data frame for merging

Code:-
merge = pd.merge(df_left,df_right,left_on='key',right_index=True,how='right') 
# left_on = For selecting the columns of left dataframe
# right_index= For selecting the index of right dataframe
# how = method of merging

print("\nMerging of two data frame first and second according to values of column and index of data frame method is right merge : \n\n", merge)

Output:-
Merging of two data frame first and second according to values of column and index of data frame method is right merge : 

   key  data  group_data
0   x     0          10
1   x     1          10
0   x     0          20
1   x     1          20

Here we apply the right merge, so the output takes all index values of the right data frame and only the matching values of the left data frame column.

3. Merging on the basis of the index of two different data frames.

Here we select the index of both data frames and merge on the basis of matching index values.

The given data frames are as follows

Code:-
df_left = DataFrame({'group_data':[10,20]}, index = ['x','x'])
print('First DataFrame : \n',df_left)

df_right = DataFrame({'group_data_1':[10,20,30]}, index = ['x','x','y'])
print('\nSecond DataFrame : \n',df_right)

Output:-
First DataFrame : 
    group_data
x          10
x          20

Second DataFrame : 
    group_data_1
x            10
x            20
y            30

Example 1:- Apply outer merge on the basis of the index of both data frames.

Note:- Use above given data frame for merging

Code:-
merge = pd.merge(df_left,df_right,left_index=True,right_index=True,how='outer') 
# left_index = For selecting the index of left dataframe
# right_index= For selecting the index of right dataframe
# how = method of merging

print("\nMerging of two data frame first and second according to values of index and index of data frame method is outer merge : \n\n", merge)


Output:-
Merging of two dataframe first and second according to values of index and index of data frame method is outer merge : 

    group_data  group_data_1
x        10.0            10
x        10.0            20
x        20.0            10
x        20.0            20
y         NaN            30

Example 2:- Apply right merge on the basis of the index of both data frames.

Note:- Use above given data frame for merging

Code:-
merge = pd.merge(df_left,df_right,left_index=True,right_index=True,how='right') 
# left_index = For selecting the index of left dataframe
# right_index= For selecting the index of right dataframe
# how = method of merging

print("\nMerging of two data frame first and second according to values of index and index of data frame method is right  merge : \n\n", merge)

Output:-
Merging of two dataframe first and second according to values of index and index of data frame method is right merge : 

    group_data  group_data_1
x        10.0            10
x        20.0            10
x        10.0            20
x        20.0            20
y         NaN            30

Example 3:- Apply left merge on the basis of the index of both data frames.

Note:- Use above given data frame for merging

Code:-
merge = pd.merge(df_left,df_right,left_index=True,right_index=True,how='left') 
# left_index = For selecting the index of left dataframe
# right_index= For selecting the index of right dataframe
# how = method of merging

print("\nMerging of two data frame first and second according to values of index and index of data frame method is left merge : \n\n", merge)

Output:-

Merging of two data frame first and second according to values of index and index of data frame method is left merge : 

    group_data  group_data_1
x          10            10
x          10            20
x          20            10
x          20            20

Join in Data Frame

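If you want to join data frames according to their index, use the .join function. It joins on the index by default (a column of the calling data frame can be selected with on), and it also accepts a method (left / right / inner / outer). The default method is left.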
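Since .join works on the index, it behaves like pd.merge with left_index=True and right_index=True. The sketch below illustrates this on two small hypothetical frames, a and b, that share some index labels; it is an illustration of the relationship, not a statement about how pandas implements it.

```python
import pandas as pd

# Hypothetical frames that share some index labels.
a = pd.DataFrame({"group_data": [10, 20]}, index=["x", "y"])
b = pd.DataFrame({"group_data_1": [1, 2, 3]}, index=["x", "y", "z"])

# Index-based join with the inner method.
joined = a.join(b, how="inner")

# The same operation written with pd.merge on both indexes.
merged = pd.merge(a, b, left_index=True, right_index=True, how="inner")

print(joined)
print(joined.equals(merged))
```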

Example 1:- Join the first data frame to another data frame using the .join function.
Code:-
df_1 = DataFrame({'key': list('xxyyz'),'data': np.arange(5)})
print('First DataFrame : \n',df_1)

df_2 = DataFrame({'para': list('ABCDEF'),'values': np.arange(10,16)})
print('Second DataFrame : \n',df_2)

result = df_1.join(df_2)
print('\n Join data frame :- \n\n',result)


Output:-

First DataFrame : 
   key  data
0   x     0
1   x     1
2   y     2
3   y     3
4   z     4
Second DataFrame : 
   para  values
0    A      10
1    B      11
2    C      12
3    D      13
4    E      14
5    F      15

 Join data frame:- 

   key  data para  values
0   x     0    A      10
1   x     1    B      11
2   y     2    C      12
3   y     3    D      13
4   z     4    E      14
	

By default, join performs a left join on the index, so only the rows whose index appears in the first data frame (0 to 4) are kept, and row 5 of the second data frame is dropped.

Example 2:- Apply the outer method in the join function.

Code:-

df_1 = DataFrame({'key': list('xxyyz'),'data': np.arange(5)})
print('First DataFrame : \n',df_1)

df_2 = DataFrame({'para': list('ABCDEF'),'values': np.arange(10,16)})
print('Second DataFrame : \n',df_2)

result1 = df_1.join(df_2,how='outer')
print('\n Join using Outer method :- \n',result1)

Output:-
First DataFrame : 
   key  data
0   x     0
1   x     1
2   y     2
3   y     3
4   z     4
Second DataFrame : 
   para  values
0    A      10
1    B      11
2    C      12
3    D      13
4    E      14
5    F      15

 Join using Outer method :- 
    key  data para  values
0    x   0.0    A      10
1    x   1.0    B      11
2    y   2.0    C      12
3    y   3.0    D      13
4    z   4.0    E      14
5  NaN   NaN    F      15

A NaN value means that the first data frame has no data for that row: index 5 exists only in the second data frame.
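To pick out the rows that have no counterpart in the first data frame, one option is to filter on the NaN values with isna(). This is a minimal sketch on the same data; the variable name missing_in_first is an illustrative choice.

```python
import numpy as np
import pandas as pd

df_1 = pd.DataFrame({"key": list("xxyyz"), "data": np.arange(5)})
df_2 = pd.DataFrame({"para": list("ABCDEF"), "values": np.arange(10, 16)})

result1 = df_1.join(df_2, how="outer")

# Rows where "key" is NaN exist only in the second data frame.
missing_in_first = result1[result1["key"].isna()]
print(missing_in_first)
```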

Example 3:- Join on the values of a column: use the on keyword in the join function to select a column of the first data frame, which is matched against the index of the second data frame.

Code:-
df_1 = DataFrame({'key': list('xxyyz'),'data': np.arange(5)})
print('First DataFrame : \n',df_1)

df_2 = DataFrame({'para': list('ABCDEF'),'values': np.arange(10,16)},index=list('xxyyzw'))
print('Second DataFrame : \n',df_2)

Result = df_1.join(df_2,on='key')
print('\n join on the basis of key column value of data :\n\n ',Result)

Output:-
First DataFrame : 
   key  data
0   x     0
1   x     1
2   y     2
3   y     3
4   z     4
Second DataFrame : 
   para  values
x    A      10
x    B      11
y    C      12
y    D      13
z    E      14
w    F      15

 join on the basis of key column value of data :

    key  data para  values
0   x     0    A      10
0   x     0    B      11
1   x     1    A      10
1   x     1    B      11
2   y     2    C      12
2   y     2    D      13
3   y     3    C      12
3   y     3    D      13
4   z     4    E      14


Concatenation in Python

Concatenation in Python (pandas) joins different data frames along their rows or their columns.

Syntax to concat two data frames:-

pd.concat([first_dataframe, second_dataframe], axis=0/1)

The data frames are passed together as a list.

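Parameters:-

axis = 0 -> concat data row-wise (the rows of the second data frame are placed below the first)

axis = 1 -> concat data column-wise (the columns of the second data frame are placed beside the first)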
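The effect of the axis parameter is easiest to see in the shape of the result. A minimal sketch with two small hypothetical frames, top and bottom, each with 2 rows and 2 columns:

```python
import pandas as pd

# Hypothetical frames, each with 2 rows and 2 columns.
top = pd.DataFrame({"key": ["x", "y"], "data": [0, 1]})
bottom = pd.DataFrame({"key": ["A", "B"], "data": [10, 11]})

# axis=0 stacks the rows: 2 + 2 rows, still 2 columns.
stacked = pd.concat([top, bottom], axis=0)
print(stacked.shape)        # (4, 2)

# axis=1 places the columns side by side: 2 rows, 2 + 2 columns.
side_by_side = pd.concat([top, bottom], axis=1)
print(side_by_side.shape)   # (2, 4)
```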


Example 1:- Concat two data frames row-wise.

Code:-

df_1 = DataFrame({'key': list('xxyyz'),'data': np.arange(5)})
print('First DataFrame : \n',df_1)

df_2 = DataFrame({'key': list('ABCDE'),'data': np.arange(10,15)})
print('Second DataFrame : \n',df_2)

concat_data = pd.concat([df_1,df_2],axis=0)

print("\nConcat two dataframe according to row wise : \n\n",concat_data)

Output:-

First DataFrame : 
   key  data
0   x     0
1   x     1
2   y     2
3   y     3
4   z     4
Second DataFrame : 
   key  data
0   A    10
1   B    11
2   C    12
3   D    13
4   E    14

Concat two data frame according to row-wise : 

   key  data
0   x     0
1   x     1
2   y     2
3   y     3
4   z     4
0   A    10
1   B    11
2   C    12
3   D    13
4   E    14
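Note that the row-wise result above keeps the original index of each data frame, so the labels 0 to 4 appear twice. If a fresh, continuous index is wanted, pd.concat accepts ignore_index=True. A minimal sketch on the same data:

```python
import numpy as np
import pandas as pd

df_1 = pd.DataFrame({"key": list("xxyyz"), "data": np.arange(5)})
df_2 = pd.DataFrame({"key": list("ABCDE"), "data": np.arange(10, 15)})

# ignore_index=True discards the old labels and numbers the rows 0..n-1.
concat_data = pd.concat([df_1, df_2], axis=0, ignore_index=True)
print(concat_data)
```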

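Example 2:- Concat two data frames column-wise.

Note:- Column-wise concatenation aligns the rows of the two data frames by their index labels. Here both data frames have the same index (0 to 4), so every row lines up; where an index label exists in only one data frame, the missing cells are filled with NaN.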

Code:-

df_1 = DataFrame({'key': list('xxyyz'),'data': np.arange(5)})
print('First DataFrame : \n',df_1)

df_2 = DataFrame({'para': list('ABCDE'),'values': np.arange(10,15)})
print('Second DataFrame : \n',df_2)

concat_data = pd.concat([df_1,df_2],axis=1)

print("\n Concat two dataframe according to column wise : \n\n",concat_data)

Output:-
First DataFrame : 
   key  data
0   x     0
1   x     1
2   y     2
3   y     3
4   z     4
Second DataFrame : 
   para  values
0    A      10
1    B      11
2    C      12
3    D      13
4    E      14

Concat two data frame according to column-wise : 

   key  data para  values
0   x     0    A      10
1   x     1    B      11
2   y     2    C      12
3   y     3    D      13
4   z     4    E      14
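In the example above both data frames have five rows. As a sketch of what happens when the row counts differ, the second frame below is shortened to three rows (an assumption made only for this illustration); pandas aligns on the index and fills the rows that have no partner with NaN.

```python
import numpy as np
import pandas as pd

df_1 = pd.DataFrame({"key": list("xxyyz"), "data": np.arange(5)})
# Shortened second frame (3 rows instead of 5), for illustration only.
df_2 = pd.DataFrame({"para": list("ABC"), "values": np.arange(10, 13)})

# Rows are aligned by index label; labels 3 and 4 have no match in df_2.
concat_data = pd.concat([df_1, df_2], axis=1)
print(concat_data)
```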

Conclusion

In this blog, you learned how to merge and join multiple data frames according to their columns and index, the different types of merging (inner, left, right, outer), and how to concatenate two different data frames in Python with pandas.
