Hi!
> Using R, I plotted a log-log plot of the frequencies in the Brown Corpus
> using
> plot(sort(file.tfl$f, decreasing=TRUE), xlab="rank", ylab="frequency",
>      log="x,y")
> However, I would also like to add lines showing the curves for a Zipfian
> distribution and for Zipf-Mandelbrot.
It's fairly straightforward to add such curves to the plot above with
lines(), e.g. for Zipf-Mandelbrot:
k <- 1:length(file.tfl$f)
f <- C / (k + b)^a  # Zipf-Mandelbrot law with parameters a >= 1, b >= 0, C
lines(k, f, lwd=2, col="red")
The tricky part is to determine suitable values for the parameters a, b and C.
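For the plain Zipf curve (b = 0), one crude way to get rough values for a and C is an ordinary least-squares fit in log-log space. This is only a sketch in base R, not something zipfR does for you. The data below are synthetic (an exact power law made up for illustration), and the names f.obs, fit, C.hat and a.hat are my own. On real frequency data such a fit tends to be dominated by the long tail of low-frequency types, so treat the result as a starting point at best.

```r
# Synthetic stand-in for sort(file.tfl$f, decreasing=TRUE); the values are
# made up (an exact power law with C = 1000 and a = 1.1), not Brown data.
f.obs <- 1000 / (1:50)^1.1
k <- seq_along(f.obs)

# Zipf's law f = C / k^a is a straight line in log-log space:
#   log(f) = log(C) - a * log(k)
fit <- lm(log(f.obs) ~ log(k))
C.hat <- exp(coef(fit)[[1]])  # intercept estimates log(C)
a.hat <- -coef(fit)[[2]]      # slope estimates -a

# Plot the data on log-log axes and overlay the fitted Zipf curve
plot(k, f.obs, xlab="rank", ylab="frequency", log="xy")
lines(k, C.hat / k^a.hat, lwd=2, col="blue")
```

With your own data, replace f.obs by sort(file.tfl$f, decreasing=TRUE) and keep the rest unchanged.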
If you happen to be using the "zipfR" package (just guessing because
of the .tfl terminology in your code example), you can easily get an
approximation to the Zipf-Mandelbrot law from a trained ZM model (the package
does not offer a valid LNRE model for Zipf's original law). In essence,
this is what you have to do:
# assuming that file.tfl is a "tfl" object created by zipfR
file.zm <- lnre("zm", tfl2spc(file.tfl))
k <- 1:length(file.tfl$f)
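# tqlnre() gives the approximate probability of the type at rank k;
# multiplying by the sample size N turns it into an expected frequency
f <- tqlnre(file.zm, k) * N(file.tfl)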
lines(k, f, lwd=2, col="red")
Hope this helps,
Stefan