Hands-on statistics and visualisation for MSc projects
Dr Benjamin Inden
comes with lots of special-purpose add-on packages. Language similar to Matlab/Octave.
▶ SPSS: Widely used in business, social sciences, etc.
Recommendations
▶ If you only need to do simple statistics and visualisation, and are not confident in your programming skills, Excel might be a good choice (study video courses, e.g. Master Excel for data science, on LinkedIn Learning)
▶ If you want to work in the biomedical industry, it might be worth looking at R
▶ If you want to work in management, it might be worth looking at SPSS
▶ Look at required skills in job adverts, ask your supervisor
▶ In many cases, Python tools are a good choice because they are free to use, come with lots of examples, and can be easily extended (see the tutorial Statistics in Python, on which some of the following slides are based)
▶ pyplot: simplified state-based interface (similar to Matlab)
▶ pylab: pyplot and numpy completely imported
▶ pandas: Data structures and methods for statistics
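To illustrate what "state-based" means for pyplot, here is a minimal sketch that draws the same curve twice: once through pyplot's implicit "current figure", and once through explicit figure and axes objects. The data, the titles, and the choice of the non-interactive `Agg` backend are assumptions made for this illustration and are not taken from the slides.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

xs = [0, 1, 2, 3, 4]
ys = [x * x for x in xs]

# State-based style: pyplot keeps track of the "current" figure and axes,
# and every plt.* call acts on them implicitly.
plt.figure()
plt.plot(xs, ys, 'k-')
plt.title('State-based')
state_axes = plt.gca()  # fetch the axes that pyplot was drawing on

# Object-oriented style: the figure and axes are explicit objects.
fig, ax = plt.subplots()
ax.plot(xs, ys, 'k-')
ax.set_title('Object-oriented')

plt.close('all')
```

The state-based style is shorter for quick plots; explicit objects tend to be easier to follow once a script handles several figures at once.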
CSV files
▶ A standard file format for the interchange of statistical data; can be opened by Excel, R, pandas, etc.
Data frames are a data structure to hold tables of data (including column names, row numbers, etc.)
import pandas
data = pandas.read_csv('examples/brain_size.csv', sep=';', na_values=".")
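As a self-contained sketch of the same call, the following reads a small semicolon-separated table from a string instead of a file. The column names and values are made up for illustration; only `sep=';'` and `na_values="."` mirror the call above.

```python
import io
import pandas

# Hypothetical sample data in the same layout as the example above:
# semicolon-separated, with "." marking a missing value.
csv_text = """id;group;score
1;A;3.5
2;B;.
3;A;4.0
"""

data = pandas.read_csv(io.StringIO(csv_text), sep=';', na_values=".")

print(data.shape)              # (number of rows, number of columns)
print(data.columns.tolist())   # column names taken from the header line
print(data['score'].mean())    # missing values are skipped by default
```

Declaring "." as a missing-value marker lets pandas read the `score` column as numbers rather than text, so statistics such as the mean work directly.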
import random
import scipy.stats

# Two samples of 30 values each, drawn from normal distributions with different means and spreads
d1 = [random.gauss(0.4, 0.5) for i in range(30)]
d2 = [random.gauss(0.7, 0.3) for i in range(30)]
t_stat, p_value = scipy.stats.ttest_ind(d1, d2)
print('t-test assuming equal variance:', p_value)
t_stat, p_value = scipy.stats.ttest_ind(d1, d2, equal_var=False)
print('t-test not assuming equal variance:', p_value)
u, p_value = scipy.stats.mannwhitneyu(d1, d2)
print('Wilcoxon rank-sum (Mann-Whitney U) test:', p_value)
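The p-values printed above are usually compared against a significance level. The sketch below shows one way to do that. The fixed seed and the level `alpha = 0.05` are assumptions chosen for illustration, not part of the slides.

```python
import random
import scipy.stats

random.seed(1)  # fixed seed so repeated runs give the same samples (an assumption)

d1 = [random.gauss(0.4, 0.5) for _ in range(30)]
d2 = [random.gauss(0.7, 0.3) for _ in range(30)]

alpha = 0.05  # conventional significance level, chosen here for illustration

# Collect the p-value of each test under a readable name.
results = {
    "Student's t-test": scipy.stats.ttest_ind(d1, d2).pvalue,
    "Welch's t-test": scipy.stats.ttest_ind(d1, d2, equal_var=False).pvalue,
    "Mann-Whitney U test": scipy.stats.mannwhitneyu(d1, d2).pvalue,
}

for name, p in results.items():
    verdict = "reject" if p < alpha else "do not reject"
    print(f"{name}: p = {p:.4f} -> {verdict} H0 at alpha = {alpha}")
```

Here H0 is the null hypothesis that both samples come from distributions with the same location. Because the two samples are generated with different standard deviations, the variant that does not assume equal variance is the more appropriate t-test for this data.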
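import matplotlib.pyplot as plt

plt.title('First plot')    # the diagram title
plt.xlabel('x')            # axis labels
plt.ylabel('y')
plt.axis([0, 6, 0, 20])    # axis value ranges: [xmin, xmax, ymin, ymax]
plt.plot([0, 1, 2, 3, 4], [0, 1, 4, 9, 16], 'k-', [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], 'ro')
plt.show()
plt.close()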
Plotting data
plt.show()
plt.close()
Second example
import math

# f is assumed to be defined elsewhere; its definition is not shown here
t = [0.02 * x for x in range(250)]
ft = [f(x) for x in t]
gt = [math.cos(2.0 * math.pi * x) for x in t]
plt.subplot(2, 1, 1)
plt.plot(t, ft, linestyle='-', color='black', linewidth=1)
plt.plot(t, gt, linestyle=':', color='blue', linewidth=1)
plt.subplot(a, b, c) takes three arguments:
▶ a: number of rows
▶ b: number of columns
▶ c: number of the active sub-plot
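The three arguments can be seen in action in the following sketch, which stacks two sub-plots in a grid of 2 rows and 1 column. The sine curve stands in for the slides' function, whose definition is not shown, and the `Agg` backend is an assumption so that the sketch runs without a display.

```python
import math
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption for this sketch)
import matplotlib.pyplot as plt

t = [0.02 * x for x in range(250)]
st = [math.sin(2.0 * math.pi * x) for x in t]  # illustrative curve, not the slides' f
gt = [math.cos(2.0 * math.pi * x) for x in t]

plt.subplot(2, 1, 1)  # 2 rows, 1 column, activate sub-plot 1 (top)
plt.plot(t, st, linestyle='-', color='black', linewidth=1)
top = plt.gca()

plt.subplot(2, 1, 2)  # same grid, activate sub-plot 2 (bottom)
plt.plot(t, gt, linestyle=':', color='blue', linewidth=1)
bottom = plt.gca()

n_axes = len(plt.gcf().axes)  # number of sub-plots in the current figure
plt.close()
```

Sub-plots are numbered row by row starting at 1, so in a 2 by 1 grid number 1 is the upper panel and number 2 the lower one.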