The correlation matrix doesn't need to be MxN. All you are doing is checking the correlation between N columns, so the result is an NxN matrix. From that NxN matrix you can keep the entries you are interested in and ignore the rest.
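import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from io import StringIO

# skipinitialspace=True strips the blank after each comma, so the columns are named 'A1', not ' A1'
df = pd.read_csv(StringIO('''TWEET, A1, A2, B1, B2, B3
tweet text, 0.23, 0.54, 120, 60, 39
tweet text, 0.33, 0.7, 70, 20, 36
tweet text, 0.8, 0.41, 68, 52, 29
'''), sep=',', skipinitialspace=True)

# Keep only the numeric columns: pandas 2.x raises an error if the text column TWEET is passed to corr()
corr = df.select_dtypes('number').corr()
print(corr)  # Pandas correlation matrix
sns.heatmap(corr, annot=True)
plt.show()  # display the heatmap when running as a script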
Output:
          A1        A2        B1        B2        B3
A1  1.000000 -0.732859 -0.661319  0.167649 -0.991352
A2 -0.732859  1.000000 -0.025703 -0.793614  0.637235
B1 -0.661319 -0.025703  1.000000  0.628619  0.754036
B2  0.167649 -0.793614  0.628619  1.000000 -0.036827
B3 -0.991352  0.637235  0.754036 -0.036827  1.000000
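
If what you actually want is the MxN block (for example the A columns against the B columns), one way is to slice it out of the full NxN matrix with `.loc`. This is only a sketch: the split into `a_cols` and `b_cols` is an assumption about which columns you care about, and those two names are made up for illustration.

```python
from io import StringIO

import pandas as pd

# Same sample data as above; skipinitialspace strips the blank after each comma.
csv_text = """TWEET, A1, A2, B1, B2, B3
tweet text, 0.23, 0.54, 120, 60, 39
tweet text, 0.33, 0.7, 70, 20, 36
tweet text, 0.8, 0.41, 68, 52, 29
"""
df = pd.read_csv(StringIO(csv_text), skipinitialspace=True)

# Full NxN correlation matrix over the numeric columns only.
corr = df.select_dtypes("number").corr()

# Hypothetical grouping: rows are the A columns, columns are the B columns.
a_cols = ["A1", "A2"]
b_cols = ["B1", "B2", "B3"]

# Label-based slice: a 2x3 (MxN) block taken from the 5x5 matrix.
sub = corr.loc[a_cols, b_cols]
print(sub)
```

The same `sub` DataFrame can be passed to `sns.heatmap(sub, annot=True)` if you only want to plot that block.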