Example Data Details

This notebook shows some ways to use the package to gain insight into the data.

First, import the package:

In [1]:
import dvb.datascience as ds

Describe the data

In [2]:
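# Build a pipeline that loads the iris sample data and summarizes it
p = ds.Pipeline()
p.addPipe('read', ds.data.SampleData('iris'))
# Connect the 'df' output of 'read' to the 'df' input of 'describe'
p.addPipe('describe', ds.eda.Describe(), [('read', 'df', 'df')])
# Run the pipeline and show the output of the 'describe' pipe
p.transform(name="describe", close_plt=True)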
'Drawing diagram using blockdiag'
_images/ExampleDataDetails_3_1.png

Transform describe

       sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)      target
count         150.000000        150.000000         150.000000        150.000000  150.000000
mean            5.843333          3.054000           3.758667          1.198667    1.000000
std             0.828066          0.433594           1.764420          0.763161    0.819232
min             4.300000          2.000000           1.000000          0.100000    0.000000
25%             5.100000          2.800000           1.600000          0.300000    0.000000
50%             5.800000          3.000000           4.350000          1.300000    1.000000
75%             6.400000          3.300000           5.100000          1.800000    2.000000
max             7.900000          4.400000           6.900000          2.500000    2.000000
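
The statistics in this table can be cross-checked by hand. The following is a minimal sketch using only the Python standard library, not the code that `ds.eda.Describe` runs. It assumes that the `target` column holds 50 rows each of the classes 0, 1 and 2 (as the dumped data in this notebook shows), that `std` is the sample standard deviation (divisor n - 1), and that the quartiles are linearly interpolated. The helper name `describe_column` is hypothetical.

```python
import statistics


def describe_column(values):
    """Return count, mean, sample std, min, quartiles and max for one column."""
    data = sorted(values)
    # "inclusive" interpolates linearly between the two nearest data points
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "count": len(data),
        "mean": statistics.fmean(data),
        "std": statistics.stdev(data),  # sample standard deviation (n - 1)
        "min": data[0],
        "25%": q1,
        "50%": q2,
        "75%": q3,
        "max": data[-1],
    }


# Assumption: the iris target column is 50 rows of each class 0, 1 and 2
target = [0] * 50 + [1] * 50 + [2] * 50

summary = describe_column(target)
for name, value in summary.items():
    print(f"{name:<5} {value:.6f}")
```

Under these assumptions the printed values should agree with the `target` column of the table, including the standard deviation of about 0.819232.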

Dump the data

In [3]:
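# Build a pipeline that loads the iris sample data and prints every row
p = ds.Pipeline()
p.addPipe('read', ds.data.SampleData('iris'))
# Connect the 'df' output of 'read' to the 'df' input of 'dump'
p.addPipe('dump', ds.eda.Dump(), [('read', 'df', 'df')])
# Run the pipeline and show the output of the 'dump' pipe
p.transform(name="dump", close_plt=True)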
'Drawing diagram using blockdiag'
_images/ExampleDataDetails_5_1.png

Transform dump

sepal length (cm) sepal width (cm) petal length (cm) petal width (cm) target
0 5.1 3.5 1.4 0.2 0
1 4.9 3.0 1.4 0.2 0
2 4.7 3.2 1.3 0.2 0
3 4.6 3.1 1.5 0.2 0
4 5.0 3.6 1.4 0.2 0
5 5.4 3.9 1.7 0.4 0
6 4.6 3.4 1.4 0.3 0
7 5.0 3.4 1.5 0.2 0
8 4.4 2.9 1.4 0.2 0
9 4.9 3.1 1.5 0.1 0
10 5.4 3.7 1.5 0.2 0
11 4.8 3.4 1.6 0.2 0
12 4.8 3.0 1.4 0.1 0
13 4.3 3.0 1.1 0.1 0
14 5.8 4.0 1.2 0.2 0
15 5.7 4.4 1.5 0.4 0
16 5.4 3.9 1.3 0.4 0
17 5.1 3.5 1.4 0.3 0
18 5.7 3.8 1.7 0.3 0
19 5.1 3.8 1.5 0.3 0
20 5.4 3.4 1.7 0.2 0
21 5.1 3.7 1.5 0.4 0
22 4.6 3.6 1.0 0.2 0
23 5.1 3.3 1.7 0.5 0
24 4.8 3.4 1.9 0.2 0
25 5.0 3.0 1.6 0.2 0
26 5.0 3.4 1.6 0.4 0
27 5.2 3.5 1.5 0.2 0
28 5.2 3.4 1.4 0.2 0
29 4.7 3.2 1.6 0.2 0
30 4.8 3.1 1.6 0.2 0
31 5.4 3.4 1.5 0.4 0
32 5.2 4.1 1.5 0.1 0
33 5.5 4.2 1.4 0.2 0
34 4.9 3.1 1.5 0.1 0
35 5.0 3.2 1.2 0.2 0
36 5.5 3.5 1.3 0.2 0
37 4.9 3.1 1.5 0.1 0
38 4.4 3.0 1.3 0.2 0
39 5.1 3.4 1.5 0.2 0
40 5.0 3.5 1.3 0.3 0
41 4.5 2.3 1.3 0.3 0
42 4.4 3.2 1.3 0.2 0
43 5.0 3.5 1.6 0.6 0
44 5.1 3.8 1.9 0.4 0
45 4.8 3.0 1.4 0.3 0
46 5.1 3.8 1.6 0.2 0
47 4.6 3.2 1.4 0.2 0
48 5.3 3.7 1.5 0.2 0
49 5.0 3.3 1.4 0.2 0
50 7.0 3.2 4.7 1.4 1
51 6.4 3.2 4.5 1.5 1
52 6.9 3.1 4.9 1.5 1
53 5.5 2.3 4.0 1.3 1
54 6.5 2.8 4.6 1.5 1
55 5.7 2.8 4.5 1.3 1
56 6.3 3.3 4.7 1.6 1
57 4.9 2.4 3.3 1.0 1
58 6.6 2.9 4.6 1.3 1
59 5.2 2.7 3.9 1.4 1
60 5.0 2.0 3.5 1.0 1
61 5.9 3.0 4.2 1.5 1
62 6.0 2.2 4.0 1.0 1
63 6.1 2.9 4.7 1.4 1
64 5.6 2.9 3.6 1.3 1
65 6.7 3.1 4.4 1.4 1
66 5.6 3.0 4.5 1.5 1
67 5.8 2.7 4.1 1.0 1
68 6.2 2.2 4.5 1.5 1
69 5.6 2.5 3.9 1.1 1
70 5.9 3.2 4.8 1.8 1
71 6.1 2.8 4.0 1.3 1
72 6.3 2.5 4.9 1.5 1
73 6.1 2.8 4.7 1.2 1
74 6.4 2.9 4.3 1.3 1
75 6.6 3.0 4.4 1.4 1
76 6.8 2.8 4.8 1.4 1
77 6.7 3.0 5.0 1.7 1
78 6.0 2.9 4.5 1.5 1
79 5.7 2.6 3.5 1.0 1
80 5.5 2.4 3.8 1.1 1
81 5.5 2.4 3.7 1.0 1
82 5.8 2.7 3.9 1.2 1
83 6.0 2.7 5.1 1.6 1
84 5.4 3.0 4.5 1.5 1
85 6.0 3.4 4.5 1.6 1
86 6.7 3.1 4.7 1.5 1
87 6.3 2.3 4.4 1.3 1
88 5.6 3.0 4.1 1.3 1
89 5.5 2.5 4.0 1.3 1
90 5.5 2.6 4.4 1.2 1
91 6.1 3.0 4.6 1.4 1
92 5.8 2.6 4.0 1.2 1
93 5.0 2.3 3.3 1.0 1
94 5.6 2.7 4.2 1.3 1
95 5.7 3.0 4.2 1.2 1
96 5.7 2.9 4.2 1.3 1
97 6.2 2.9 4.3 1.3 1
98 5.1 2.5 3.0 1.1 1
99 5.7 2.8 4.1 1.3 1
100 6.3 3.3 6.0 2.5 2
101 5.8 2.7 5.1 1.9 2
102 7.1 3.0 5.9 2.1 2
103 6.3 2.9 5.6 1.8 2
104 6.5 3.0 5.8 2.2 2
105 7.6 3.0 6.6 2.1 2
106 4.9 2.5 4.5 1.7 2
107 7.3 2.9 6.3 1.8 2
108 6.7 2.5 5.8 1.8 2
109 7.2 3.6 6.1 2.5 2
110 6.5 3.2 5.1 2.0 2
111 6.4 2.7 5.3 1.9 2
112 6.8 3.0 5.5 2.1 2
113 5.7 2.5 5.0 2.0 2
114 5.8 2.8 5.1 2.4 2
115 6.4 3.2 5.3 2.3 2
116 6.5 3.0 5.5 1.8 2
117 7.7 3.8 6.7 2.2 2
118 7.7 2.6 6.9 2.3 2
119 6.0 2.2 5.0 1.5 2
120 6.9 3.2 5.7 2.3 2
121 5.6 2.8 4.9 2.0 2
122 7.7 2.8 6.7 2.0 2
123 6.3 2.7 4.9 1.8 2
124 6.7 3.3 5.7 2.1 2
125 7.2 3.2 6.0 1.8 2
126 6.2 2.8 4.8 1.8 2
127 6.1 3.0 4.9 1.8 2
128 6.4 2.8 5.6 2.1 2
129 7.2 3.0 5.8 1.6 2
130 7.4 2.8 6.1 1.9 2
131 7.9 3.8 6.4 2.0 2
132 6.4 2.8 5.6 2.2 2
133 6.3 2.8 5.1 1.5 2
134 6.1 2.6 5.6 1.4 2
135 7.7 3.0 6.1 2.3 2
136 6.3 3.4 5.6 2.4 2
137 6.4 3.1 5.5 1.8 2
138 6.0 3.0 4.8 1.8 2
139 6.9 3.1 5.4 2.1 2
140 6.7 3.1 5.6 2.4 2
141 6.9 3.1 5.1 2.3 2
142 5.8 2.7 5.1 1.9 2
143 6.8 3.2 5.9 2.3 2
144 6.7 3.3 5.7 2.5 2
145 6.7 3.0 5.2 2.3 2
146 6.3 2.5 5.0 1.9 2
147 6.5 3.0 5.2 2.0 2
148 6.2 3.4 5.4 2.3 2
149 5.9 3.0 5.1 1.8 2
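
Both cells in this notebook connect pipes with a list of triples such as `('read', 'df', 'df')`. One reading of that triple is (source pipe name, output key of the source, input key of the receiving pipe). The sketch below illustrates that wiring idea with a toy class; it is not the dvb.datascience implementation. `ToyPipeline`, `add_pipe`, `read_sample` and `dump` are hypothetical names, pipes are plain functions rather than pipe objects, and the three sample rows are copied from the start of the dump output above.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Pipe = Callable[[Dict[str, Any]], Dict[str, Any]]
Link = Tuple[str, str, str]  # (source pipe, source output key, input key)


class ToyPipeline:
    """Hypothetical stand-in that only illustrates how pipes are wired."""

    def __init__(self) -> None:
        self.pipes: Dict[str, Pipe] = {}
        self.links: Dict[str, List[Link]] = {}

    def add_pipe(self, name: str, pipe: Pipe,
                 inputs: Optional[List[Link]] = None) -> None:
        self.pipes[name] = pipe
        self.links[name] = list(inputs or [])

    def transform(self) -> Dict[str, Dict[str, Any]]:
        outputs: Dict[str, Dict[str, Any]] = {}
        # Assumption: pipes were added in dependency order
        for name, pipe in self.pipes.items():
            kwargs = {in_key: outputs[src][out_key]
                      for src, out_key, in_key in self.links[name]}
            outputs[name] = pipe(kwargs)
        return outputs


def read_sample(_: Dict[str, Any]) -> Dict[str, Any]:
    # First three iris rows, as listed in the dump output
    return {"df": [(5.1, 3.5, 1.4, 0.2, 0),
                   (4.9, 3.0, 1.4, 0.2, 0),
                   (4.7, 3.2, 1.3, 0.2, 0)]}


def dump(inputs: Dict[str, Any]) -> Dict[str, Any]:
    # Pass the rows through unchanged so the caller can display them
    return {"rows": list(inputs["df"])}


toy = ToyPipeline()
toy.add_pipe("read", read_sample)
toy.add_pipe("dump", dump, [("read", "df", "df")])
results = toy.transform()
for row in results["dump"]["rows"]:
    print(row)
```

The design point the sketch tries to show is that each pipe only knows its own input and output keys; the pipeline object is what maps one pipe's output onto another pipe's input.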