transform histogram to violin plot in r with ggplot
I am currently trying to learn r with the help of Hadley Wickham's great resources ("r for data scientists", "ggplot2 Elegant Graphics for Data Analysis"). So far I was able to find answers to all my problems there (thank you so much, Hadley!), but not this time.
Currently, I am working with data from an instrument that estimates particle size by the light the particles scatter (DLS, Zetasizer Nano, Malvern Instruments). The data extracted from this device are some summary statistics (e.g. mean particle size) and histogram data: x = size (split in bins), y = intensity [%].
Here is a tibble of one of my measurements:
# A tibble: 70 x 3
sample_name intensities bins
<chr> <dbl> <dbl>
1 core formulation 1 0 0.4
2 core formulation 1 0 0.463
3 core formulation 1 0 0.536
4 core formulation 1 0 0.621
5 core formulation 1 0 0.720
6 core formulation 1 0 0.833
7 core formulation 1 0 0.965
8 core formulation 1 0 1.12
9 core formulation 1 0 1.29
10 core formulation 1 0 1.50
11 core formulation 1 0 1.74
12 core formulation 1 0 2.01
13 core formulation 1 0 2.33
14 core formulation 1 0 2.70
15 core formulation 1 0 3.12
16 core formulation 1 0 3.62
17 core formulation 1 0 4.19
18 core formulation 1 0 4.85
19 core formulation 1 0 5.62
20 core formulation 1 0 6.50
21 core formulation 1 0 7.53
22 core formulation 1 0 8.72
23 core formulation 1 0 10.1
24 core formulation 1 0 11.7
25 core formulation 1 0 13.5
26 core formulation 1 0 15.7
27 core formulation 1 0 18.2
28 core formulation 1 0 21.0
29 core formulation 1 0 24.4
30 core formulation 1 0 28.2
31 core formulation 1 0 32.7
32 core formulation 1 0 37.8
33 core formulation 1 0 43.8
34 core formulation 1 0.2 50.8
35 core formulation 1 1.4 58.8
36 core formulation 1 3.7 68.1
37 core formulation 1 6.9 78.8
38 core formulation 1 10.2 91.3
39 core formulation 1 12.9 106.
40 core formulation 1 14.4 122.
41 core formulation 1 14.4 142.
42 core formulation 1 13 164.
43 core formulation 1 10.3 190.
44 core formulation 1 7.1 220.
45 core formulation 1 3.9 255
46 core formulation 1 1.5 295.
47 core formulation 1 0.2 342
48 core formulation 1 0 396.
49 core formulation 1 0 459.
50 core formulation 1 0 531.
51 core formulation 1 0 615.
52 core formulation 1 0 712.
53 core formulation 1 0 825
54 core formulation 1 0 955.
55 core formulation 1 0 1106
56 core formulation 1 0 1281
57 core formulation 1 0 1484
58 core formulation 1 0 1718
59 core formulation 1 0 1990
60 core formulation 1 0 2305
61 core formulation 1 0 2669
62 core formulation 1 0 3091
63 core formulation 1 0 3580
64 core formulation 1 0 4145
65 core formulation 1 0 4801
66 core formulation 1 0 5560
67 core formulation 1 0 6439
68 core formulation 1 0 7456
69 core formulation 1 0 8635
70 core formulation 1 0 10000
Here is the data produced with the dput()
command:
structure(list(sample_name = c("core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1"), intensities = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.2, 1.4, 3.7, 6.9, 10.2, 12.9,
14.4, 14.4, 13, 10.3, 7.1, 3.9, 1.5, 0.2, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), bins = c(0.4,
0.4632, 0.5365, 0.6213, 0.7195, 0.8332, 0.9649, 1.117, 1.294,
1.499, 1.736, 2.01, 2.328, 2.696, 3.122, 3.615, 4.187, 4.849,
5.615, 6.503, 7.531, 8.721, 10.1, 11.7, 13.54, 15.69, 18.17,
21.04, 24.36, 28.21, 32.67, 37.84, 43.82, 50.75, 58.77, 68.06,
78.82, 91.28, 105.7, 122.4, 141.8, 164.2, 190.1, 220.2, 255,
295.3, 342, 396.1, 458.7, 531.2, 615.1, 712.4, 825, 955.4, 1106,
1281, 1484, 1718, 1990, 2305, 2669, 3091, 3580, 4145, 4801, 5560,
6439, 7456, 8635, 10000)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -70L))
I can produce a histogram with no problems from this data:
library(tidyverse)
ggplot (DLS_intensities_core, aes(bins,intensities) ) +
geom_line() +
scale_x_continuous(trans = 'log10')
In order to show the overall distribution of my particle size, I would like to convert this data into a violin plot and use the summary statistics provided by the device in a second layer of my plot.
Therefore, I would like to transform this data to be able to create a violin plot from it.
I have already tried feeding it to the stat_density () argument of the violin plot but so far with no success.
Do you know how to create a violin plot from this data?
Thank you very much!
Best,
Dominik
r ggplot2 histogram transformation violin-plot
add a comment |
I am currently trying to learn r with the help of Hadley Wickham's great resources ("r for data scientists", "ggplot2 Elegant Graphics for Data Analysis"). So far I was able to find answers to all my problems there (thank you so much, Hadley!), but not this time.
Currently, I am working with data from an instrument that estimates particle size by the light the particles scatter (DLS, Zetasizer Nano, Malvern Instruments). The data extracted from this device are some summary statistics (e.g. mean particle size) and histogram data: x = size (split in bins), y = intensity [%].
Here is a tibble of one of my measurements:
# A tibble: 70 x 3
sample_name intensities bins
<chr> <dbl> <dbl>
1 core formulation 1 0 0.4
2 core formulation 1 0 0.463
3 core formulation 1 0 0.536
4 core formulation 1 0 0.621
5 core formulation 1 0 0.720
6 core formulation 1 0 0.833
7 core formulation 1 0 0.965
8 core formulation 1 0 1.12
9 core formulation 1 0 1.29
10 core formulation 1 0 1.50
11 core formulation 1 0 1.74
12 core formulation 1 0 2.01
13 core formulation 1 0 2.33
14 core formulation 1 0 2.70
15 core formulation 1 0 3.12
16 core formulation 1 0 3.62
17 core formulation 1 0 4.19
18 core formulation 1 0 4.85
19 core formulation 1 0 5.62
20 core formulation 1 0 6.50
21 core formulation 1 0 7.53
22 core formulation 1 0 8.72
23 core formulation 1 0 10.1
24 core formulation 1 0 11.7
25 core formulation 1 0 13.5
26 core formulation 1 0 15.7
27 core formulation 1 0 18.2
28 core formulation 1 0 21.0
29 core formulation 1 0 24.4
30 core formulation 1 0 28.2
31 core formulation 1 0 32.7
32 core formulation 1 0 37.8
33 core formulation 1 0 43.8
34 core formulation 1 0.2 50.8
35 core formulation 1 1.4 58.8
36 core formulation 1 3.7 68.1
37 core formulation 1 6.9 78.8
38 core formulation 1 10.2 91.3
39 core formulation 1 12.9 106.
40 core formulation 1 14.4 122.
41 core formulation 1 14.4 142.
42 core formulation 1 13 164.
43 core formulation 1 10.3 190.
44 core formulation 1 7.1 220.
45 core formulation 1 3.9 255
46 core formulation 1 1.5 295.
47 core formulation 1 0.2 342
48 core formulation 1 0 396.
49 core formulation 1 0 459.
50 core formulation 1 0 531.
51 core formulation 1 0 615.
52 core formulation 1 0 712.
53 core formulation 1 0 825
54 core formulation 1 0 955.
55 core formulation 1 0 1106
56 core formulation 1 0 1281
57 core formulation 1 0 1484
58 core formulation 1 0 1718
59 core formulation 1 0 1990
60 core formulation 1 0 2305
61 core formulation 1 0 2669
62 core formulation 1 0 3091
63 core formulation 1 0 3580
64 core formulation 1 0 4145
65 core formulation 1 0 4801
66 core formulation 1 0 5560
67 core formulation 1 0 6439
68 core formulation 1 0 7456
69 core formulation 1 0 8635
70 core formulation 1 0 10000
Here is the data produced with the dput()
command:
structure(list(sample_name = c("core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1"), intensities = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.2, 1.4, 3.7, 6.9, 10.2, 12.9,
14.4, 14.4, 13, 10.3, 7.1, 3.9, 1.5, 0.2, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), bins = c(0.4,
0.4632, 0.5365, 0.6213, 0.7195, 0.8332, 0.9649, 1.117, 1.294,
1.499, 1.736, 2.01, 2.328, 2.696, 3.122, 3.615, 4.187, 4.849,
5.615, 6.503, 7.531, 8.721, 10.1, 11.7, 13.54, 15.69, 18.17,
21.04, 24.36, 28.21, 32.67, 37.84, 43.82, 50.75, 58.77, 68.06,
78.82, 91.28, 105.7, 122.4, 141.8, 164.2, 190.1, 220.2, 255,
295.3, 342, 396.1, 458.7, 531.2, 615.1, 712.4, 825, 955.4, 1106,
1281, 1484, 1718, 1990, 2305, 2669, 3091, 3580, 4145, 4801, 5560,
6439, 7456, 8635, 10000)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -70L))
I can produce a histogram with no problems from this data:
library(tidyverse)
ggplot (DLS_intensities_core, aes(bins,intensities) ) +
geom_line() +
scale_x_continuous(trans = 'log10')
In order to show the overall distribution of my particle size, I would like to convert this data into a violin plot and use the summary statistics provided by the device in a second layer of my plot.
Therefore, I would like to transform this data to be able to create a violin plot from it.
I have already tried feeding it to the stat_density () argument of the violin plot but so far with no success.
Do you know how to create a violin plot from this data?
Thank you very much!
Best,
Dominik
r ggplot2 histogram transformation violin-plot
1
Welcome to SO! Could you jump over to stackoverflow.com/tags/r/info and take a look at the suggestions for how to post data (some are in links off of that URL)? I think the SO R contributors could make quick work of this if the data was in a tad easier format to work with.
– hrbrmstr
Nov 25 '18 at 16:33
I am sorry, I will add it immediately! thank you for pointing it out!
– Dome
Nov 25 '18 at 16:39
1
Wow! Best data dump of the day here! (and, no apologies are required. the SO question entry screen leaves much to be desired). Are you just looking for a violin plot ofDLS_intensities_core$bins
andDLS_intensities_core$intensities
or are there some other transformations you needed performed on them?
– hrbrmstr
Nov 25 '18 at 17:30
Thanks :-) I think I am just looking for a violin plot ofDLS_intensities_core$bins
andDLS_intensities_core$intensities
. In the end, I want to produce a plot with the following specs: x-axis: particle size [nm]; y-axis: sample name (e.g. "core formulation"); 1st. layer: violin plot (from the data I presented here); 2nd. layer: summary statistics (calculated by the Zetasizer, not shown here for clarity) That means I will need to do some additional transformations for sure, but I want to try it myself first ;-)
– Dome
Nov 25 '18 at 17:43
add a comment |
I am currently trying to learn r with the help of Hadley Wickham's great resources ("r for data scientists", "ggplot2 Elegant Graphics for Data Analysis"). So far I was able to find answers to all my problems there (thank you so much, Hadley!), but not this time.
Currently, I am working with data from an instrument that estimates particle size by the light the particles scatter (DLS, Zetasizer Nano, Malvern Instruments). The data extracted from this device are some summary statistics (e.g. mean particle size) and histogram data: x = size (split in bins), y = intensity [%].
Here is a tibble of one of my measurements:
# A tibble: 70 x 3
sample_name intensities bins
<chr> <dbl> <dbl>
1 core formulation 1 0 0.4
2 core formulation 1 0 0.463
3 core formulation 1 0 0.536
4 core formulation 1 0 0.621
5 core formulation 1 0 0.720
6 core formulation 1 0 0.833
7 core formulation 1 0 0.965
8 core formulation 1 0 1.12
9 core formulation 1 0 1.29
10 core formulation 1 0 1.50
11 core formulation 1 0 1.74
12 core formulation 1 0 2.01
13 core formulation 1 0 2.33
14 core formulation 1 0 2.70
15 core formulation 1 0 3.12
16 core formulation 1 0 3.62
17 core formulation 1 0 4.19
18 core formulation 1 0 4.85
19 core formulation 1 0 5.62
20 core formulation 1 0 6.50
21 core formulation 1 0 7.53
22 core formulation 1 0 8.72
23 core formulation 1 0 10.1
24 core formulation 1 0 11.7
25 core formulation 1 0 13.5
26 core formulation 1 0 15.7
27 core formulation 1 0 18.2
28 core formulation 1 0 21.0
29 core formulation 1 0 24.4
30 core formulation 1 0 28.2
31 core formulation 1 0 32.7
32 core formulation 1 0 37.8
33 core formulation 1 0 43.8
34 core formulation 1 0.2 50.8
35 core formulation 1 1.4 58.8
36 core formulation 1 3.7 68.1
37 core formulation 1 6.9 78.8
38 core formulation 1 10.2 91.3
39 core formulation 1 12.9 106.
40 core formulation 1 14.4 122.
41 core formulation 1 14.4 142.
42 core formulation 1 13 164.
43 core formulation 1 10.3 190.
44 core formulation 1 7.1 220.
45 core formulation 1 3.9 255
46 core formulation 1 1.5 295.
47 core formulation 1 0.2 342
48 core formulation 1 0 396.
49 core formulation 1 0 459.
50 core formulation 1 0 531.
51 core formulation 1 0 615.
52 core formulation 1 0 712.
53 core formulation 1 0 825
54 core formulation 1 0 955.
55 core formulation 1 0 1106
56 core formulation 1 0 1281
57 core formulation 1 0 1484
58 core formulation 1 0 1718
59 core formulation 1 0 1990
60 core formulation 1 0 2305
61 core formulation 1 0 2669
62 core formulation 1 0 3091
63 core formulation 1 0 3580
64 core formulation 1 0 4145
65 core formulation 1 0 4801
66 core formulation 1 0 5560
67 core formulation 1 0 6439
68 core formulation 1 0 7456
69 core formulation 1 0 8635
70 core formulation 1 0 10000
Here is the data produced with the dput()
command:
structure(list(sample_name = c("core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1"), intensities = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.2, 1.4, 3.7, 6.9, 10.2, 12.9,
14.4, 14.4, 13, 10.3, 7.1, 3.9, 1.5, 0.2, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), bins = c(0.4,
0.4632, 0.5365, 0.6213, 0.7195, 0.8332, 0.9649, 1.117, 1.294,
1.499, 1.736, 2.01, 2.328, 2.696, 3.122, 3.615, 4.187, 4.849,
5.615, 6.503, 7.531, 8.721, 10.1, 11.7, 13.54, 15.69, 18.17,
21.04, 24.36, 28.21, 32.67, 37.84, 43.82, 50.75, 58.77, 68.06,
78.82, 91.28, 105.7, 122.4, 141.8, 164.2, 190.1, 220.2, 255,
295.3, 342, 396.1, 458.7, 531.2, 615.1, 712.4, 825, 955.4, 1106,
1281, 1484, 1718, 1990, 2305, 2669, 3091, 3580, 4145, 4801, 5560,
6439, 7456, 8635, 10000)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -70L))
I can produce a histogram with no problems from this data:
library(tidyverse)
ggplot (DLS_intensities_core, aes(bins,intensities) ) +
geom_line() +
scale_x_continuous(trans = 'log10')
In order to show the overall distribution of my particle size, I would like to convert this data into a violin plot and use the summary statistics provided by the device in a second layer of my plot.
Therefore, I would like to transform this data to be able to create a violin plot from it.
I have already tried feeding it to the stat_density () argument of the violin plot but so far with no success.
Do you know how to create a violin plot from this data?
Thank you very much!
Best,
Dominik
r ggplot2 histogram transformation violin-plot
I am currently trying to learn r with the help of Hadley Wickham's great resources ("r for data scientists", "ggplot2 Elegant Graphics for Data Analysis"). So far I was able to find answers to all my problems there (thank you so much, Hadley!), but not this time.
Currently, I am working with data from an instrument that estimates particle size by the light the particles scatter (DLS, Zetasizer Nano, Malvern Instruments). The data extracted from this device are some summary statistics (e.g. mean particle size) and histogram data: x = size (split in bins), y = intensity [%].
Here is a tibble of one of my measurements:
# A tibble: 70 x 3
sample_name intensities bins
<chr> <dbl> <dbl>
1 core formulation 1 0 0.4
2 core formulation 1 0 0.463
3 core formulation 1 0 0.536
4 core formulation 1 0 0.621
5 core formulation 1 0 0.720
6 core formulation 1 0 0.833
7 core formulation 1 0 0.965
8 core formulation 1 0 1.12
9 core formulation 1 0 1.29
10 core formulation 1 0 1.50
11 core formulation 1 0 1.74
12 core formulation 1 0 2.01
13 core formulation 1 0 2.33
14 core formulation 1 0 2.70
15 core formulation 1 0 3.12
16 core formulation 1 0 3.62
17 core formulation 1 0 4.19
18 core formulation 1 0 4.85
19 core formulation 1 0 5.62
20 core formulation 1 0 6.50
21 core formulation 1 0 7.53
22 core formulation 1 0 8.72
23 core formulation 1 0 10.1
24 core formulation 1 0 11.7
25 core formulation 1 0 13.5
26 core formulation 1 0 15.7
27 core formulation 1 0 18.2
28 core formulation 1 0 21.0
29 core formulation 1 0 24.4
30 core formulation 1 0 28.2
31 core formulation 1 0 32.7
32 core formulation 1 0 37.8
33 core formulation 1 0 43.8
34 core formulation 1 0.2 50.8
35 core formulation 1 1.4 58.8
36 core formulation 1 3.7 68.1
37 core formulation 1 6.9 78.8
38 core formulation 1 10.2 91.3
39 core formulation 1 12.9 106.
40 core formulation 1 14.4 122.
41 core formulation 1 14.4 142.
42 core formulation 1 13 164.
43 core formulation 1 10.3 190.
44 core formulation 1 7.1 220.
45 core formulation 1 3.9 255
46 core formulation 1 1.5 295.
47 core formulation 1 0.2 342
48 core formulation 1 0 396.
49 core formulation 1 0 459.
50 core formulation 1 0 531.
51 core formulation 1 0 615.
52 core formulation 1 0 712.
53 core formulation 1 0 825
54 core formulation 1 0 955.
55 core formulation 1 0 1106
56 core formulation 1 0 1281
57 core formulation 1 0 1484
58 core formulation 1 0 1718
59 core formulation 1 0 1990
60 core formulation 1 0 2305
61 core formulation 1 0 2669
62 core formulation 1 0 3091
63 core formulation 1 0 3580
64 core formulation 1 0 4145
65 core formulation 1 0 4801
66 core formulation 1 0 5560
67 core formulation 1 0 6439
68 core formulation 1 0 7456
69 core formulation 1 0 8635
70 core formulation 1 0 10000
Here is the data produced with the dput()
command:
structure(list(sample_name = c("core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1", "core formulation 1",
"core formulation 1", "core formulation 1"), intensities = c(0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.2, 1.4, 3.7, 6.9, 10.2, 12.9,
14.4, 14.4, 13, 10.3, 7.1, 3.9, 1.5, 0.2, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), bins = c(0.4,
0.4632, 0.5365, 0.6213, 0.7195, 0.8332, 0.9649, 1.117, 1.294,
1.499, 1.736, 2.01, 2.328, 2.696, 3.122, 3.615, 4.187, 4.849,
5.615, 6.503, 7.531, 8.721, 10.1, 11.7, 13.54, 15.69, 18.17,
21.04, 24.36, 28.21, 32.67, 37.84, 43.82, 50.75, 58.77, 68.06,
78.82, 91.28, 105.7, 122.4, 141.8, 164.2, 190.1, 220.2, 255,
295.3, 342, 396.1, 458.7, 531.2, 615.1, 712.4, 825, 955.4, 1106,
1281, 1484, 1718, 1990, 2305, 2669, 3091, 3580, 4145, 4801, 5560,
6439, 7456, 8635, 10000)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -70L))
I can produce a histogram with no problems from this data:
library(tidyverse)
ggplot (DLS_intensities_core, aes(bins,intensities) ) +
geom_line() +
scale_x_continuous(trans = 'log10')
In order to show the overall distribution of my particle size, I would like to convert this data into a violin plot and use the summary statistics provided by the device in a second layer of my plot.
Therefore, I would like to transform this data to be able to create a violin plot from it.
I have already tried feeding it to the stat_density () argument of the violin plot but so far with no success.
Do you know how to create a violin plot from this data?
Thank you very much!
Best,
Dominik
r ggplot2 histogram transformation violin-plot
r ggplot2 histogram transformation violin-plot
edited Nov 25 '18 at 17:01
Dome
asked Nov 25 '18 at 16:30
DomeDome
115
115
1
Welcome to SO! Could you jump over to stackoverflow.com/tags/r/info and take a look at the suggestions for how to post data (some are in links off of that URL)? I think the SO R contributors could make quick work of this if the data was in a tad easier format to work with.
– hrbrmstr
Nov 25 '18 at 16:33
I am sorry, I will add it immediately! thank you for pointing it out!
– Dome
Nov 25 '18 at 16:39
1
Wow! Best data dump of the day here! (and, no apologies are required. the SO question entry screen leaves much to be desired). Are you just looking for a violin plot ofDLS_intensities_core$bins
andDLS_intensities_core$intensities
or are there some other transformations you needed performed on them?
– hrbrmstr
Nov 25 '18 at 17:30
Thanks :-) I think I am just looking for a violin plot ofDLS_intensities_core$bins
andDLS_intensities_core$intensities
. In the end, I want to produce a plot with the following specs: x-axis: particle size [nm]; y-axis: sample name (e.g. "core formulation"); 1st. layer: violin plot (from the data I presented here); 2nd. layer: summary statistics (calculated by the Zetasizer, not shown here for clarity) That means I will need to do some additional transformations for sure, but I want to try it myself first ;-)
– Dome
Nov 25 '18 at 17:43
add a comment |
1
Welcome to SO! Could you jump over to stackoverflow.com/tags/r/info and take a look at the suggestions for how to post data (some are in links off of that URL)? I think the SO R contributors could make quick work of this if the data was in a tad easier format to work with.
– hrbrmstr
Nov 25 '18 at 16:33
I am sorry, I will add it immediately! thank you for pointing it out!
– Dome
Nov 25 '18 at 16:39
1
Wow! Best data dump of the day here! (and, no apologies are required. the SO question entry screen leaves much to be desired). Are you just looking for a violin plot ofDLS_intensities_core$bins
andDLS_intensities_core$intensities
or are there some other transformations you needed performed on them?
– hrbrmstr
Nov 25 '18 at 17:30
Thanks :-) I think I am just looking for a violin plot ofDLS_intensities_core$bins
andDLS_intensities_core$intensities
. In the end, I want to produce a plot with the following specs: x-axis: particle size [nm]; y-axis: sample name (e.g. "core formulation"); 1st. layer: violin plot (from the data I presented here); 2nd. layer: summary statistics (calculated by the Zetasizer, not shown here for clarity) That means I will need to do some additional transformations for sure, but I want to try it myself first ;-)
– Dome
Nov 25 '18 at 17:43
1
1
Welcome to SO! Could you jump over to stackoverflow.com/tags/r/info and take a look at the suggestions for how to post data (some are in links off of that URL)? I think the SO R contributors could make quick work of this if the data was in a tad easier format to work with.
– hrbrmstr
Nov 25 '18 at 16:33
Welcome to SO! Could you jump over to stackoverflow.com/tags/r/info and take a look at the suggestions for how to post data (some are in links off of that URL)? I think the SO R contributors could make quick work of this if the data was in a tad easier format to work with.
– hrbrmstr
Nov 25 '18 at 16:33
I am sorry, I will add it immediately! thank you for pointing it out!
– Dome
Nov 25 '18 at 16:39
I am sorry, I will add it immediately! thank you for pointing it out!
– Dome
Nov 25 '18 at 16:39
1
1
Wow! Best data dump of the day here! (and, no apologies are required. the SO question entry screen leaves much to be desired). Are you just looking for a violin plot of
DLS_intensities_core$bins
and DLS_intensities_core$intensities
or are there some other transformations you needed performed on them?– hrbrmstr
Nov 25 '18 at 17:30
Wow! Best data dump of the day here! (and, no apologies are required. the SO question entry screen leaves much to be desired). Are you just looking for a violin plot of
DLS_intensities_core$bins
and DLS_intensities_core$intensities
or are there some other transformations you needed performed on them?– hrbrmstr
Nov 25 '18 at 17:30
Thanks :-) I think I am just looking for a violin plot of
DLS_intensities_core$bins
and DLS_intensities_core$intensities
. In the end, I want to produce a plot with the following specs: x-axis: particle size [nm]; y-axis: sample name (e.g. "core formulation"); 1st. layer: violin plot (from the data I presented here); 2nd. layer: summary statistics (calculated by the Zetasizer, not shown here for clarity) That means I will need to do some additional transformations for sure, but I want to try it myself first ;-)– Dome
Nov 25 '18 at 17:43
Thanks :-) I think I am just looking for a violin plot of
DLS_intensities_core$bins
and DLS_intensities_core$intensities
. In the end, I want to produce a plot with the following specs: x-axis: particle size [nm]; y-axis: sample name (e.g. "core formulation"); 1st. layer: violin plot (from the data I presented here); 2nd. layer: summary statistics (calculated by the Zetasizer, not shown here for clarity) That means I will need to do some additional transformations for sure, but I want to try it myself first ;-)– Dome
Nov 25 '18 at 17:43
add a comment |
2 Answers
2
active
oldest
votes
I'll update this (if needed) after your reply to the second comment. You can get violin plots of both bins
and intensities
with:
library(hrbrthemes)
gather(DLS_intensities_core, measure, value, -sample_name) %>%
ggplot(aes(measure, value)) +
geom_violin(scale = "count") +
scale_y_comma() +
facet_wrap(~measure, scales="free") +
labs(
x = NULL, y = "A better label than this",
title = "A better title than this",
caption = "NOTE: Free Y scales"
) +
theme_ipsum_rc(grid="Y") +
theme(axis.text.x = element_blank())
I generally like to layer in the points as well:
gather(DLS_intensities_core, measure, value, -sample_name) %>%
ggplot(aes(measure, value)) +
geom_violin(scale = "count") +
ggbeeswarm::geom_quasirandom() +
scale_y_comma() +
facet_wrap(~measure, scales="free") +
labs(
x = NULL, y = "A better label than this",
title = "A better title than this",
caption = "NOTE: Free Y scales"
) +
theme_ipsum_rc(grid="Y") +
theme(axis.text.x = element_blank())
Per your comment, perhaps this might be a better way to show the bins
distribution along with the relationship with intensities
:
library(hrbrthemes)
library(tidyverse)
ggplot(DLS_intensities_core, aes(x="", bins)) +
geom_violin(scale = "count") +
ggbeeswarm::geom_quasirandom(
aes(size = intensities, fill = intensities), shape = 21
) +
scale_y_comma(trans="log10") +
viridis::scale_fill_viridis(direction = -1, trans = "log1p") +
scale_size_continuous(trans = "log1p", range = c(2, 10)) +
guides(fill = guide_legend()) +
labs(
x = NULL, y = "A better label than this",
title = "A better title than this"
) +
theme_ipsum_rc(grid="Y")
You'd have to do some other, custom transformation to try to get the violin shape to vary with the intensities (and it wouldn't really be reflecting the distribution at that point).
thank you very much for your plots and the associated code, I like the theme of your plots very much! And the addition of dots to violin plots is a good idea :-) The first plot (bins) is really close to what I want to have: Size (or bins) on the y -axis (in log10 scale) and the expansion in x-direction of the plot is the value of the intensity (not the number of cases). Does this help or is it just the wrong plot for this idea?
– Dome
Nov 25 '18 at 18:11
See if the latest update does a bit more of what you want.
– hrbrmstr
Nov 25 '18 at 18:23
yes, this goes in the right direction! However, I would like the intensities to be depicted by the extension of the violin plot in x direction (in a way that the biggest extension of the violin plot is at around 100 on the y-axis, at the position that you printed the big dots at.).
– Dome
Nov 26 '18 at 9:18
I had another thought while in the shower: Maybe my tibble is not in the right format for this type of plot. Since the violin plot's stat counts rows and depicts a summary, the most straight forward way would be to multiplay the bin size by intensity and create the corresponding amount of rows with this precise (bin-) size. E.g. if the size is 100 and the corresponding intensity 20, then I would create 20 new rows with bin = 100. I will try this approach today and let you know what I found :-) Thanks again for your help! :-)
– Dome
Nov 26 '18 at 9:18
add a comment |
I found a solution to my problem, it is probably not very elegant:
library (tidyverse)
DLS_intensities_core <- DLS_intensities_core %>%
mutate(counts = intensities * 10 )
vectors <- DLS_intensities_core %>%
filter(counts > 0)
bins_v <- vectors$bins
count_v <- vectors$counts
violin_DLSdata <- as.tibble(rep.int(bins_v, count_v))
violin_DLSdata$sample_name <- "core formulation 1"
ggplot (violin_DLSdata, aes(sample_name, value)) +
geom_violin() +
labs(
x = NULL, y = "size"
) +
scale_y_continuous(trans = 'log10', limits = c(1, 1000))
for my whole dataset it looks like this:
I have added: summary statistic as red dots with errorbars.
What do you think?
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53469540%2ftransform-histogram-to-violin-plot-in-r-with-ggplot%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
I'll update this (if needed) after your reply to the second comment. You can get violin plots of both bins
and intensities
with:
library(hrbrthemes)
gather(DLS_intensities_core, measure, value, -sample_name) %>%
ggplot(aes(measure, value)) +
geom_violin(scale = "count") +
scale_y_comma() +
facet_wrap(~measure, scales="free") +
labs(
x = NULL, y = "A better label than this",
title = "A better title than this",
caption = "NOTE: Free Y scales"
) +
theme_ipsum_rc(grid="Y") +
theme(axis.text.x = element_blank())
I generally like to layer in the points as well:
gather(DLS_intensities_core, measure, value, -sample_name) %>%
ggplot(aes(measure, value)) +
geom_violin(scale = "count") +
ggbeeswarm::geom_quasirandom() +
scale_y_comma() +
facet_wrap(~measure, scales="free") +
labs(
x = NULL, y = "A better label than this",
title = "A better title than this",
caption = "NOTE: Free Y scales"
) +
theme_ipsum_rc(grid="Y") +
theme(axis.text.x = element_blank())
Per your comment, perhaps this might be a better way to show the bins
distribution along with the relationship with intensities
:
library(hrbrthemes)
library(tidyverse)
ggplot(DLS_intensities_core, aes(x="", bins)) +
geom_violin(scale = "count") +
ggbeeswarm::geom_quasirandom(
aes(size = intensities, fill = intensities), shape = 21
) +
scale_y_comma(trans="log10") +
viridis::scale_fill_viridis(direction = -1, trans = "log1p") +
scale_size_continuous(trans = "log1p", range = c(2, 10)) +
guides(fill = guide_legend()) +
labs(
x = NULL, y = "A better label than this",
title = "A better title than this"
) +
theme_ipsum_rc(grid="Y")
You'd have to do some other, custom transformation to try to get the violin shape to vary with the intensities (and it wouldn't really be reflecting the distribution at that point).
thank you very much for your plots and the associated code, I like the theme of your plots very much! And the addition of dots to violin plots is a good idea :-) The first plot (bins) is really close to what I want to have: Size (or bins) on the y -axis (in log10 scale) and the expansion in x-direction of the plot is the value of the intensity (not the number of cases). Does this help or is it just the wrong plot for this idea?
– Dome
Nov 25 '18 at 18:11
See if the latest update does a bit more of what you want.
– hrbrmstr
Nov 25 '18 at 18:23
yes, this goes in the right direction! However, I would like the intensities to be depicted by the extension of the violin plot in x direction (in a way that the biggest extension of the violin plot is at around 100 on the y-axis, at the position that you printed the big dots at.).
– Dome
Nov 26 '18 at 9:18
I had another thought while in the shower: Maybe my tibble is not in the right format for this type of plot. Since the violin plot's stat counts rows and depicts a summary, the most straight forward way would be to multiplay the bin size by intensity and create the corresponding amount of rows with this precise (bin-) size. E.g. if the size is 100 and the corresponding intensity 20, then I would create 20 new rows with bin = 100. I will try this approach today and let you know what I found :-) Thanks again for your help! :-)
– Dome
Nov 26 '18 at 9:18
add a comment |
I'll update this (if needed) after your reply to the second comment. You can get violin plots of both bins
and intensities
with:
library(hrbrthemes)
gather(DLS_intensities_core, measure, value, -sample_name) %>%
ggplot(aes(measure, value)) +
geom_violin(scale = "count") +
scale_y_comma() +
facet_wrap(~measure, scales="free") +
labs(
x = NULL, y = "A better label than this",
title = "A better title than this",
caption = "NOTE: Free Y scales"
) +
theme_ipsum_rc(grid="Y") +
theme(axis.text.x = element_blank())
I generally like to layer in the points as well:
gather(DLS_intensities_core, measure, value, -sample_name) %>%
ggplot(aes(measure, value)) +
geom_violin(scale = "count") +
ggbeeswarm::geom_quasirandom() +
scale_y_comma() +
facet_wrap(~measure, scales="free") +
labs(
x = NULL, y = "A better label than this",
title = "A better title than this",
caption = "NOTE: Free Y scales"
) +
theme_ipsum_rc(grid="Y") +
theme(axis.text.x = element_blank())
Per your comment, perhaps this might be a better way to show the bins
distribution along with the relationship with intensities
:
library(hrbrthemes)
library(tidyverse)
ggplot(DLS_intensities_core, aes(x="", bins)) +
geom_violin(scale = "count") +
ggbeeswarm::geom_quasirandom(
aes(size = intensities, fill = intensities), shape = 21
) +
scale_y_comma(trans="log10") +
viridis::scale_fill_viridis(direction = -1, trans = "log1p") +
scale_size_continuous(trans = "log1p", range = c(2, 10)) +
guides(fill = guide_legend()) +
labs(
x = NULL, y = "A better label than this",
title = "A better title than this"
) +
theme_ipsum_rc(grid="Y")
You'd have to do some other, custom transformation to try to get the violin shape to vary with the intensities (and it wouldn't really be reflecting the distribution at that point).
thank you very much for your plots and the associated code, I like the theme of your plots very much! And the addition of dots to violin plots is a good idea :-) The first plot (bins) is really close to what I want to have: Size (or bins) on the y -axis (in log10 scale) and the expansion in x-direction of the plot is the value of the intensity (not the number of cases). Does this help or is it just the wrong plot for this idea?
– Dome
Nov 25 '18 at 18:11
See if the latest update does a bit more of what you want.
– hrbrmstr
Nov 25 '18 at 18:23
yes, this goes in the right direction! However, I would like the intensities to be depicted by the extension of the violin plot in x direction (in a way that the biggest extension of the violin plot is at around 100 on the y-axis, at the position that you printed the big dots at.).
– Dome
Nov 26 '18 at 9:18
I had another thought while in the shower: Maybe my tibble is not in the right format for this type of plot. Since the violin plot's stat counts rows and depicts a summary, the most straight forward way would be to multiplay the bin size by intensity and create the corresponding amount of rows with this precise (bin-) size. E.g. if the size is 100 and the corresponding intensity 20, then I would create 20 new rows with bin = 100. I will try this approach today and let you know what I found :-) Thanks again for your help! :-)
– Dome
Nov 26 '18 at 9:18
add a comment |
I'll update this (if needed) after your reply to the second comment. You can get violin plots of both bins
and intensities
with:
library(hrbrthemes)
gather(DLS_intensities_core, measure, value, -sample_name) %>%
ggplot(aes(measure, value)) +
geom_violin(scale = "count") +
scale_y_comma() +
facet_wrap(~measure, scales="free") +
labs(
x = NULL, y = "A better label than this",
title = "A better title than this",
caption = "NOTE: Free Y scales"
) +
theme_ipsum_rc(grid="Y") +
theme(axis.text.x = element_blank())
I generally like to layer in the points as well:
gather(DLS_intensities_core, measure, value, -sample_name) %>%
ggplot(aes(measure, value)) +
geom_violin(scale = "count") +
ggbeeswarm::geom_quasirandom() +
scale_y_comma() +
facet_wrap(~measure, scales="free") +
labs(
x = NULL, y = "A better label than this",
title = "A better title than this",
caption = "NOTE: Free Y scales"
) +
theme_ipsum_rc(grid="Y") +
theme(axis.text.x = element_blank())
Per your comment, perhaps this might be a better way to show the bins
distribution along with the relationship with intensities
:
library(hrbrthemes)
library(tidyverse)
ggplot(DLS_intensities_core, aes(x="", bins)) +
geom_violin(scale = "count") +
ggbeeswarm::geom_quasirandom(
aes(size = intensities, fill = intensities), shape = 21
) +
scale_y_comma(trans="log10") +
viridis::scale_fill_viridis(direction = -1, trans = "log1p") +
scale_size_continuous(trans = "log1p", range = c(2, 10)) +
guides(fill = guide_legend()) +
labs(
x = NULL, y = "A better label than this",
title = "A better title than this"
) +
theme_ipsum_rc(grid="Y")
You'd have to do some other, custom transformation to try to get the violin shape to vary with the intensities (and it wouldn't really be reflecting the distribution at that point).
I'll update this (if needed) after your reply to the second comment. You can get violin plots of both bins
and intensities
with:
library(hrbrthemes)
gather(DLS_intensities_core, measure, value, -sample_name) %>%
ggplot(aes(measure, value)) +
geom_violin(scale = "count") +
scale_y_comma() +
facet_wrap(~measure, scales="free") +
labs(
x = NULL, y = "A better label than this",
title = "A better title than this",
caption = "NOTE: Free Y scales"
) +
theme_ipsum_rc(grid="Y") +
theme(axis.text.x = element_blank())
I generally like to layer in the points as well:
gather(DLS_intensities_core, measure, value, -sample_name) %>%
ggplot(aes(measure, value)) +
geom_violin(scale = "count") +
ggbeeswarm::geom_quasirandom() +
scale_y_comma() +
facet_wrap(~measure, scales="free") +
labs(
x = NULL, y = "A better label than this",
title = "A better title than this",
caption = "NOTE: Free Y scales"
) +
theme_ipsum_rc(grid="Y") +
theme(axis.text.x = element_blank())
Per your comment, perhaps this might be a better way to show the bins
distribution along with the relationship with intensities
:
library(hrbrthemes)
library(tidyverse)
ggplot(DLS_intensities_core, aes(x="", bins)) +
geom_violin(scale = "count") +
ggbeeswarm::geom_quasirandom(
aes(size = intensities, fill = intensities), shape = 21
) +
scale_y_comma(trans="log10") +
viridis::scale_fill_viridis(direction = -1, trans = "log1p") +
scale_size_continuous(trans = "log1p", range = c(2, 10)) +
guides(fill = guide_legend()) +
labs(
x = NULL, y = "A better label than this",
title = "A better title than this"
) +
theme_ipsum_rc(grid="Y")
You'd have to do some other, custom transformation to try to get the violin shape to vary with the intensities (and it wouldn't really be reflecting the distribution at that point).
edited Nov 25 '18 at 18:22
answered Nov 25 '18 at 17:36
hrbrmstrhrbrmstr
61.3k691152
61.3k691152
thank you very much for your plots and the associated code, I like the theme of your plots very much! And the addition of dots to violin plots is a good idea :-) The first plot (bins) is really close to what I want to have: Size (or bins) on the y -axis (in log10 scale) and the expansion in x-direction of the plot is the value of the intensity (not the number of cases). Does this help or is it just the wrong plot for this idea?
– Dome
Nov 25 '18 at 18:11
See if the latest update does a bit more of what you want.
– hrbrmstr
Nov 25 '18 at 18:23
yes, this goes in the right direction! However, I would like the intensities to be depicted by the extension of the violin plot in x direction (in a way that the biggest extension of the violin plot is at around 100 on the y-axis, at the position that you printed the big dots at.).
– Dome
Nov 26 '18 at 9:18
I had another thought while in the shower: Maybe my tibble is not in the right format for this type of plot. Since the violin plot's stat counts rows and depicts a summary, the most straight forward way would be to multiplay the bin size by intensity and create the corresponding amount of rows with this precise (bin-) size. E.g. if the size is 100 and the corresponding intensity 20, then I would create 20 new rows with bin = 100. I will try this approach today and let you know what I found :-) Thanks again for your help! :-)
– Dome
Nov 26 '18 at 9:18
add a comment |
thank you very much for your plots and the associated code, I like the theme of your plots very much! And the addition of dots to violin plots is a good idea :-) The first plot (bins) is really close to what I want to have: Size (or bins) on the y -axis (in log10 scale) and the expansion in x-direction of the plot is the value of the intensity (not the number of cases). Does this help or is it just the wrong plot for this idea?
– Dome
Nov 25 '18 at 18:11
See if the latest update does a bit more of what you want.
– hrbrmstr
Nov 25 '18 at 18:23
yes, this goes in the right direction! However, I would like the intensities to be depicted by the extension of the violin plot in x direction (in a way that the biggest extension of the violin plot is at around 100 on the y-axis, at the position that you printed the big dots at.).
– Dome
Nov 26 '18 at 9:18
I had another thought while in the shower: Maybe my tibble is not in the right format for this type of plot. Since the violin plot's stat counts rows and depicts a summary, the most straight forward way would be to multiplay the bin size by intensity and create the corresponding amount of rows with this precise (bin-) size. E.g. if the size is 100 and the corresponding intensity 20, then I would create 20 new rows with bin = 100. I will try this approach today and let you know what I found :-) Thanks again for your help! :-)
– Dome
Nov 26 '18 at 9:18
thank you very much for your plots and the associated code, I like the theme of your plots very much! And the addition of dots to violin plots is a good idea :-) The first plot (bins) is really close to what I want to have: Size (or bins) on the y -axis (in log10 scale) and the expansion in x-direction of the plot is the value of the intensity (not the number of cases). Does this help or is it just the wrong plot for this idea?
– Dome
Nov 25 '18 at 18:11
thank you very much for your plots and the associated code, I like the theme of your plots very much! And the addition of dots to violin plots is a good idea :-) The first plot (bins) is really close to what I want to have: Size (or bins) on the y -axis (in log10 scale) and the expansion in x-direction of the plot is the value of the intensity (not the number of cases). Does this help or is it just the wrong plot for this idea?
– Dome
Nov 25 '18 at 18:11
See if the latest update does a bit more of what you want.
– hrbrmstr
Nov 25 '18 at 18:23
See if the latest update does a bit more of what you want.
– hrbrmstr
Nov 25 '18 at 18:23
yes, this goes in the right direction! However, I would like the intensities to be depicted by the extension of the violin plot in x direction (in a way that the biggest extension of the violin plot is at around 100 on the y-axis, at the position that you printed the big dots at.).
– Dome
Nov 26 '18 at 9:18
yes, this goes in the right direction! However, I would like the intensities to be depicted by the extension of the violin plot in x direction (in a way that the biggest extension of the violin plot is at around 100 on the y-axis, at the position that you printed the big dots at.).
– Dome
Nov 26 '18 at 9:18
I had another thought while in the shower: Maybe my tibble is not in the right format for this type of plot. Since the violin plot's stat counts rows and depicts a summary, the most straight forward way would be to multiplay the bin size by intensity and create the corresponding amount of rows with this precise (bin-) size. E.g. if the size is 100 and the corresponding intensity 20, then I would create 20 new rows with bin = 100. I will try this approach today and let you know what I found :-) Thanks again for your help! :-)
– Dome
Nov 26 '18 at 9:18
I had another thought while in the shower: Maybe my tibble is not in the right format for this type of plot. Since the violin plot's stat counts rows and depicts a summary, the most straight forward way would be to multiplay the bin size by intensity and create the corresponding amount of rows with this precise (bin-) size. E.g. if the size is 100 and the corresponding intensity 20, then I would create 20 new rows with bin = 100. I will try this approach today and let you know what I found :-) Thanks again for your help! :-)
– Dome
Nov 26 '18 at 9:18
add a comment |
I found a solution to my problem, it is probably not very elegant:
library (tidyverse)
DLS_intensities_core <- DLS_intensities_core %>%
mutate(counts = intensities * 10 )
vectors <- DLS_intensities_core %>%
filter(counts > 0)
bins_v <- vectors$bins
count_v <- vectors$counts
violin_DLSdata <- as.tibble(rep.int(bins_v, count_v))
violin_DLSdata$sample_name <- "core formulation 1"
ggplot (violin_DLSdata, aes(sample_name, value)) +
geom_violin() +
labs(
x = NULL, y = "size"
) +
scale_y_continuous(trans = 'log10', limits = c(1, 1000))
for my whole dataset it looks like this:
I have added: summary statistic as red dots with errorbars.
What do you think?
add a comment |
I found a solution to my problem, it is probably not very elegant:
library (tidyverse)
DLS_intensities_core <- DLS_intensities_core %>%
mutate(counts = intensities * 10 )
vectors <- DLS_intensities_core %>%
filter(counts > 0)
bins_v <- vectors$bins
count_v <- vectors$counts
violin_DLSdata <- as.tibble(rep.int(bins_v, count_v))
violin_DLSdata$sample_name <- "core formulation 1"
ggplot (violin_DLSdata, aes(sample_name, value)) +
geom_violin() +
labs(
x = NULL, y = "size"
) +
scale_y_continuous(trans = 'log10', limits = c(1, 1000))
for my whole dataset it looks like this:
I have added: summary statistic as red dots with errorbars.
What do you think?
add a comment |
I found a solution to my problem, it is probably not very elegant:
library (tidyverse)
DLS_intensities_core <- DLS_intensities_core %>%
mutate(counts = intensities * 10 )
vectors <- DLS_intensities_core %>%
filter(counts > 0)
bins_v <- vectors$bins
count_v <- vectors$counts
violin_DLSdata <- as.tibble(rep.int(bins_v, count_v))
violin_DLSdata$sample_name <- "core formulation 1"
ggplot (violin_DLSdata, aes(sample_name, value)) +
geom_violin() +
labs(
x = NULL, y = "size"
) +
scale_y_continuous(trans = 'log10', limits = c(1, 1000))
for my whole dataset it looks like this:
I have added: summary statistic as red dots with errorbars.
What do you think?
I found a solution to my problem, it is probably not very elegant:
library (tidyverse)
DLS_intensities_core <- DLS_intensities_core %>%
mutate(counts = intensities * 10 )
vectors <- DLS_intensities_core %>%
filter(counts > 0)
bins_v <- vectors$bins
count_v <- vectors$counts
violin_DLSdata <- as.tibble(rep.int(bins_v, count_v))
violin_DLSdata$sample_name <- "core formulation 1"
ggplot (violin_DLSdata, aes(sample_name, value)) +
geom_violin() +
labs(
x = NULL, y = "size"
) +
scale_y_continuous(trans = 'log10', limits = c(1, 1000))
for my whole dataset it looks like this:
I have added: summary statistic as red dots with errorbars.
What do you think?
edited Nov 27 '18 at 14:16
answered Nov 26 '18 at 10:29
DomeDome
115
115
add a comment |
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53469540%2ftransform-histogram-to-violin-plot-in-r-with-ggplot%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
1
Welcome to SO! Could you jump over to stackoverflow.com/tags/r/info and take a look at the suggestions for how to post data (some are in links off of that URL)? I think the SO R contributors could make quick work of this if the data was in a tad easier format to work with.
– hrbrmstr
Nov 25 '18 at 16:33
I am sorry, I will add it immediately! thank you for pointing it out!
– Dome
Nov 25 '18 at 16:39
1
Wow! Best data dump of the day here! (and, no apologies are required. the SO question entry screen leaves much to be desired). Are you just looking for a violin plot of
DLS_intensities_core$bins
andDLS_intensities_core$intensities
or are there some other transformations you needed performed on them?– hrbrmstr
Nov 25 '18 at 17:30
Thanks :-) I think I am just looking for a violin plot of
DLS_intensities_core$bins
andDLS_intensities_core$intensities
. In the end, I want to produce a plot with the following specs: x-axis: particle size [nm]; y-axis: sample name (e.g. "core formulation"); 1st. layer: violin plot (from the data I presented here); 2nd. layer: summary statistics (calculated by the Zetasizer, not shown here for clarity) That means I will need to do some additional transformations for sure, but I want to try it myself first ;-)– Dome
Nov 25 '18 at 17:43