Hi,

I am trying to make an x-y plot of the values of a certain variable against bins, i.e. the x-axis goes from 0 to 0.7 in increments of 0.02, while the y-axis is the average of the values for all the points in that interval. So I first used cut to break the data into intervals, then applied tapply with mean as the function, and plotted the results. I also replaced mean with median. The three sets of functions that I used were:

    SWfac <- cut(sorted_inp$age[1:290], seq(0, 0.7, 0.02))
    SLmean <- tapply(sorted_inp$clock_rate[1:290], SWfac, mean)
    plot(SLmean, type = "b", xaxt = "n")
    axis(1, seq(SLmean), levels(SWfac))

However, the value plotted on the y-axis does not seem to be correct. For example, in the interval 0.38-0.4 there is a very large number of points with a y-axis value below 20, and very few with a y-axis value above 20, yet the median plotted is still around the 20 mark. Looking at the data, it does not seem plausible that more than 50% of the points in that interval have a clock_rate (plotted on the y-axis) above 20.

I also tried a simple x-y scatter plot of the same 290 rows in Excel (without binning them), and the concentration of points at lower clock rates does not suggest that the medians should be as high as they are shown.

Is there something about the way these functions work with tapply that I am missing? Are there any obvious mistakes that I should look for?

Hoping to hear further.

Regards,
Lalitha
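One way to narrow this down might be to count the points in each bin and recompute a single bin's median by hand, then compare it with what tapply returns. The sketch below is only an illustration: `demo` is made-up random data standing in for sorted_inp (the real data is not shown here), and `breaks`, `bins`, `k` and the other names are hypothetical. Note that cut builds intervals of the form (a, b] by default, so a point exactly at a lower break belongs to the previous bin, and a value of exactly 0 would fall outside the first bin and become NA.

```r
# Hypothetical stand-in for sorted_inp; the values are random, not real data.
set.seed(1)
demo <- data.frame(age = runif(290, 0.001, 0.7),
                   clock_rate = rexp(290, rate = 1 / 15))

breaks <- seq(0, 0.7, by = 0.02)
bins <- cut(demo$age, breaks)        # default intervals are (a, b]

# Number of points per bin, and the per-bin median as in the tapply call.
n_per_bin <- table(bins)
med_per_bin <- tapply(demo$clock_rate, bins, median)

# Recompute the 20th bin (the one covering 0.38 to 0.40) by hand,
# using the same break values that cut used.
k <- 20
in_bin <- demo$age > breaks[k] & demo$age <= breaks[k + 1]
by_hand <- median(demo$clock_rate[in_bin])

print(levels(bins)[k])               # label of the bin being checked
print(n_per_bin[k])                  # how many points it contains
print(all.equal(as.numeric(med_per_bin)[k], by_hand))
```

If the hand-computed median matches, tapply itself is probably behaving as expected, and the next things to check would be whether rows 1:290 are really the rows plotted in Excel and whether the axis labels line up with the bins.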