data(faithful)
x <- faithful$eruptions             # eruption durations in minutes
boxplot(x)
hist(x)                             # default breakpoints
hist(x, breaks=seq(0, 6, by=0.5))   # explicit breakpoints, bin width 0.5
# The effect of 'sliding breakpoints'
par(mfrow=c(2,3))
for (shift in seq(0, 0.8, by=0.2)) {
  hist(x, breaks=seq(0, 6, by=0.5)+shift, col='red')
}
par(mfrow=c(1,1))
# Density-scale histogram with a kernel density estimate overlaid
hist(x, nclass=20, prob=TRUE, col='gray', border='white')
rug(x)
lines(density(x))
4.3.2 Lab Exercise
Do the following:
1. Run the above.
2. Experiment with different bandwidths.
3. What is the best bandwidth from visual inspection?
4. Each table should come up with its own 'optimal' bandwidth.
5. Find out what the 'rug' function does.
6. What if we want to plot only the density, not the histogram?
7. What are the names of the components of a 'density' object? (See the help page and the names(..) function.)
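As a starting point for items 2 and 7, here is a minimal sketch rather than a model answer. The three bandwidth values are arbitrary choices for illustration, not recommendations; density(), bw.nrd0(), rug() and names() are standard R functions.

```r
# Compare kernel density estimates of the eruption times under a few
# bandwidths. The values in bws are arbitrary illustrative choices.
x <- faithful$eruptions
bws <- c(0.05, 0.3, 1)

par(mfrow = c(1, 3))
for (bw in bws) {
  d_bw <- density(x, bw = bw)           # Gaussian kernel by default
  plot(d_bw, main = paste("bw =", bw))
  rug(x)
}
par(mfrow = c(1, 1))

# Inspect the object returned by density() with its default settings
d <- density(x)
names(d)   # component names of the 'density' object
d$bw       # bandwidth actually used (default rule is bw.nrd0)
```

A very small bandwidth tends to produce a spiky estimate that follows individual observations, while a very large one can smooth the two modes into one; the interesting range lies in between.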
library(MASS)
data(galaxies)
gal <- galaxies/1000
median(gal)
hist(gal, prob=TRUE, ylim=c(0, 0.3), xlim=c(0,40))
plot(x = c(0, 40), y = c(0, 0.3), type = "n", bty = "l",
     xlab = "velocity of galaxy (1000km/s)", ylab = "density")
#skip
rug(gal)
lines(density(gal, width = 3.25, n = 200), lty = 1)
lines(density(gal, width = 2.56, n = 200), lty = 3) # skip
set.seed(101)
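m <- 1000                 # number of bootstrap resamples
res <- numeric(m)         # storage for the resampled medians
for (i in seq_len(m)) res[i] <- median(sample(gal, replace=TRUE))
mean(res - median(gal))   # bootstrap estimate of the bias of the median
sqrt(var(res))            # bootstrap estimate of its standard error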
truehist(res, h=0.1)
lines(density(res, width="SJ-dpi", n=256))
5.1.2 Lab Exercise
Do the following:
1. Run the above.
2. Experiment with different bandwidths.
3. What is the best bandwidth from visual inspection?
4. What can you tell about the bias and variance of the estimator?
5. Why is set.seed used here?
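For items 4 and 5, the following sketch may help you explore. The helper boot_median is hypothetical (it is not part of the lab code above); it simply wraps the same resampling loop so that it can be run more than once.

```r
library(MASS)            # provides the galaxies data
gal <- galaxies / 1000

# Hypothetical helper, introduced only for this illustration:
# returns m bootstrap replicates of the sample median.
boot_median <- function(data, m = 1000) {
  replicate(m, median(sample(data, replace = TRUE)))
}

set.seed(101)
res1 <- boot_median(gal)
set.seed(101)
res2 <- boot_median(gal)
identical(res1, res2)    # same seed, so the same resamples are drawn

res3 <- boot_median(gal) # no reseeding: a fresh set of resamples
identical(res1, res3)

# Bootstrap estimates of bias and standard error from each run
rbind(run1 = c(bias = mean(res1 - median(gal)), se = sd(res1)),
      run3 = c(bias = mean(res3 - median(gal)), se = sd(res3)))
```

Comparing the two rows gives a rough sense of how much the bootstrap estimates themselves vary from run to run when the seed is not fixed.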