You are currently browsing the category archive for the ‘Blog’ category.

Ji vis dėlto bus! Jau galima registruotis. Visa informacija olimpiados puslapyje. Skleiskite šią džiaugsmingą žinią. Kas nori kuo nors padėti mielai kviečiam, kol kas organizavimas yra one man show.

Bediskutuojant su pirmakursiais magistrantais kilo idėja padaryti konkursą. Užduotis yra tokia: reikia parašyti programą, kuri nuskaitytų iš duoto failo duomenis ir pateiktų tą patį, ką pateikia summary.lm. Magistrantai už tai gaus balus į egzaminą, na o visiems kitiems steigiu 100 Lt premiją laimėtojui. Praktinės vertės programa jokios neturės, nes pasaulyje yra pilna programų, kurios skaičiuoja tiesinę regresiją, bet užtat tai yra puiki proga pasitikrinti savo informatikos žinias.

Konkurso taisyklės:

1. Programa turi nuskaityti failą, kuriame yra lentelės tipo duomenys, eilutėse yra stebėjimai, stulpeliuose kintamieji. Pirmas stulpelis yra priklausomas kintamasis, visi kiti stulpeliai yra nepriklausomi kintamieji. Programa turi įvertinti tiesinės regresijos modelį ir išvesti atitinkamus rezultatus. Visą programos funkcionalumą nusako šis R kodas:

dt <- read.csv("data.csv")
colnames(dt)[1] <-"y"
print(summary(lm(y~.,data=dt)))


2. Programa turi būti programa, t.y. pasirinktos OS paleidžiamasis failas.
3. OS gali būti: Windows 7, Mac OS X 10.6, Ubuntu 10.10. Šiose OS bus tikrinama ar programa pasileidžia.
4. Programa gali turėti priklausomybių, t.y. kitų programų, kurių jai reikia sėkmingai veikti. Priklausomybės turi būti nemokamos, t.y. aš nesiruošiu pirkti papildomos programinės įrangos, tam kad galėčiau patikrinti programos veikimą.
5. Programa turės būti pateikta su išeities kodu.

Laimės originaliausia programa. Pirmenybė bus teikiama paprastumui, elgantiškumui ir daugiaplatformiškumui. Testiniai failai su duomenimis bus pateikti ateinančią savaitę. Konkursas baigiasi lapkričio 13 d. 13 val. Siųsti veikiančias programas galima ir anksčiau, kad aš spėčiau pranešti, ar programa veikia teisingai. Neveikiančios programos nebus vertinamos. Dalyvauti gali visi norintys, bet konkursas yra skirtas visų pirma ekonometrijos specialybės studentams, kurių informatikos žinios nėra ypatingai geros 🙂 Visi siunčiantys turės nurodyti savo vardą, pavardę ir ką studijuoja/studijavo. Informatikai tikriausiai bus automatiškai diskvalifikuojami, arba jiems bus atskiras prizas.

Klausti galima komentaruose, programas siųsti pašto adresu, kurį rasite mano oficialiame puslapyje. Konkurso taisyklės dar gali būti tikslinamos, bet apie tai bus pranešta šiame bloge.

Jeigu šalyje šeimos “daro“ vaikus tol kol gimsta pirmas berniukas, koks šalyje bus berniukų ir mergaičių pasiskirstymas? Pilnas atsakymas čia, šiek tiek daugiau background’o čia

Įdėjau papildomos informacijos apie MTRT. Taip pat nufotografavau pagrindinį prizą.

Šiais metais organizuoju MIDI renginį. Daugiau info rasite čia

Suppose you have following kind of time series data in the data.frame:

> dtm[1:15, ]
year month LPMVTUZ LPMVTVC LPMVTXC
1  1993     3    4829    4897  397808
2  1993     4    4496    4727  399425
3  1993     5    4997    4528  400922
4  1993     6    5444    4636  402512
5  1993     7    4899    4521  403940
6  1993     8    4636    4428  405716
7  1993     9    4548    4498  407348
8  1993    10    4687    4460  409435
9  1993    11    4419    4352  410844
10 1994     0    3410    4547  412513
11 1994     1    3787    4915  413913
12 1994     2    5180    4862  416117
13 1994     3    4729    5003  417992
14 1994     4    4871    4875  419575
15 1994     5    5323    4739  421933



The variables are in the columns and you have two columns for year and
month. Suppose you need to aggregate this data, i.e. get annual values
for the variables. It is possible to do this using loops, but here is
one line solution:

> library(reshape)
> recast(dtm, year ~ variable, fun.aggregate = sum, id.var = c("year",
+     "month"))
year LPMVTUZ LPMVTVC  LPMVTXC
1  1993   42955   41047  3637950
2  1994   57885   57949  5075675
3  1995   57285   57349  5392784
4  1996   71659   71249  5692667
5  1997   77228   77074  6079579
6  1998   89375   88936  6517787
7  1999  114708  113565  7081522
8  2000  119794  120147  7748582
9  2001  160123  159500  8480626
10 2002  220737  219763  9517652
11 2003  277342  277167 10794708
12 2004  291258  289876 12256691
13 2005  288280  287910 13543842
14 2006  345355  345648 14822438
15 2007  362632  362596 16232798
16 2008  253198  255364 17327344
17 2009  117936  116994 14586582



If you want to get average instead of sum, just use
fun.aggregate=mean.

This function actually combines two functions melt and cast. First one

> mdtm <- melt(dtm, id.var = c("year", "month"))
> mdtm[1:15, ]
year month variable value
1  1993     3  LPMVTUZ  4829
2  1993     4  LPMVTUZ  4496
3  1993     5  LPMVTUZ  4997
4  1993     6  LPMVTUZ  5444
5  1993     7  LPMVTUZ  4899
6  1993     8  LPMVTUZ  4636
7  1993     9  LPMVTUZ  4548
8  1993    10  LPMVTUZ  4687
9  1993    11  LPMVTUZ  4419
10 1994     0  LPMVTUZ  3410
11 1994     1  LPMVTUZ  3787
12 1994     2  LPMVTUZ  5180
13 1994     3  LPMVTUZ  4729
14 1994     4  LPMVTUZ  4871
15 1994     5  LPMVTUZ  5323



Note that all the names of the variables are now in the column named
variable, and the corresponding values are in the column surprisingly
named value. You can control the names of these columns by the way.

After melting the data.frame now we need to cast it:

> cdtm <- cast(mdtm, year ~ variable, fun.aggregate = sum)
> cdtm
year LPMVTUZ LPMVTVC  LPMVTXC
1  1993   42955   41047  3637950
2  1994   57885   57949  5075675
3  1995   57285   57349  5392784
4  1996   71659   71249  5692667
5  1997   77228   77074  6079579
6  1998   89375   88936  6517787
7  1999  114708  113565  7081522
8  2000  119794  120147  7748582
9  2001  160123  159500  8480626
10 2002  220737  219763  9517652
11 2003  277342  277167 10794708
12 2004  291258  289876 12256691
13 2005  288280  287910 13543842
14 2006  345355  345648 14822438
15 2007  362632  362596 16232798
16 2008  253198  255364 17327344
17 2009  117936  116994 14586582



Here we use formula to tell cast function, how to pick data. You can
read more details in the help. What is in the right hand side of the
formula will become columns, and the left hand side will be rows. Try
to exchange them, this is what you will get:

> cast(mdtm, variable ~ year, fun.aggregate = sum)
variable    1993    1994    1995    1996    1997    1998    1999    2000
1  LPMVTUZ   42955   57885   57285   71659   77228   89375  114708  119794
2  LPMVTVC   41047   57949   57349   71249   77074   88936  113565  120147
3  LPMVTXC 3637950 5075675 5392784 5692667 6079579 6517787 7081522 7748582
2001    2002     2003     2004     2005     2006     2007     2008
1  160123  220737   277342   291258   288280   345355   362632   253198
2  159500  219763   277167   289876   287910   345648   362596   255364
3 8480626 9517652 10794708 12256691 13543842 14822438 16232798 17327344
2009
1   117936
2   116994
3 14586582



We got the same result, just in transposed form.

package. Also look into plyr package by the same author. And if you

In previous post I asked if somebody knew how to get highlighted R code in wordpress. Somebody (yay, I have at least one reader) took this question too seriously, and found out the hard way how to do that. So for those who want to get something like this:

fun <- function(l, nm) {
l  <- x[[nm]
ss <- summary(l)
write.csv(ss,file=paste(nm,".csv",sep=""))
}

foreach(l=x,nm=names(x)) %do% fun(l,nm)

The solution involves Emacs, which in itself is quite intimidating. But as they say, no pain, no gain. After you pass that “I want to poke my eye with red hot poker just for fun“ phase, Emacs is quite nice. The drawback of course is, that this phase must be passed 🙂

So for a start, you must have Emacs installed. For Windows I recommend downloading Vincent Goulet’s prepared Emacs distribution. It comes with several useful packages preinstalled, such as ESS. For Ubuntu, your best bet would be to install emacs-snapshot, so you get nice looking Emacs. Then you will need to install ESS. For serious R programming, I think ESS is a must. I do not go into details, but is much more convenient than standard R window.

In addition you have to install htmlize and org-mode Emacs extensions. The org-mode may be included in your Emacs distribution. The instructions how to instal org-mode can be found in the website. To install htmlize, save the file htmlize.el to some directory, and byte-compile it from Emacs with command M-x byte-compile-file. Then include the directory into your Emacs load path, i.e. include this line at the top of your .emacs (or emacs) file:

(setq load-path (cons (expand-file-name "~/src/emacs-lisp/") load-path))

If you do custom org-mode install you may want include the following line also:

(require 'org-install)

To get highlighted R code, you will also need to add this code to your .emacs file:

(defun run-ascii-sweave-on-buffer ()
(ess-command "library(ascii)\n")
(ess-command (concat "Sweave(\"" (buffer-file-name)
"\", RweaveAsciidoc)\n")))

(defun htmlize-blog-entry ()
(interactive)
(save-excursion
(let*
(goto-char (point-min))
(basic-save-buffer)
(switch-to-buffer (org-export-region-as-html
(point-min) (point-max) t "*HTML*"))
(while (re-search-forward "<pre " nil t)
(replace-match
"<pre style=\"background-color:#FFFFE5; font-size:8pt;overflow:auto\" "
t nil))
(kill-ring-save (point-min) (point-max))
)))

(defun sweave-and-htmlize-blog-entry ()
(interactive)
(save-excursion
(run-ascii-sweave-on-buffer)
(let* ((name (buffer-file-name))
(txt-filename (concat name ".txt"))
(txt-buf (find-file txt-filename)))
(switch-to-buffer txt-buf)
(revert-buffer t t t)
(goto-char (point-min))
(while (re-search-forward "BEGIN_SRC Sweave" nil t)
(replace-match "BEGIN_SRC R-transcript" t nil))
(goto-char (point-min))
(while (re-search-forward "^----" nil t)
(replace-match "" nil nil))
(basic-save-buffer)
(switch-to-buffer (org-export-region-as-html
(point-min) (point-max) t "*HTML*"))
(kill-buffer txt-buf)
(while (re-search-forward "<pre " nil t)
(replace-match
"<pre style=\"background-color:#FFFFE5; font-size:8pt;overflow:auto\" "
t nil))
(kill-ring-save (point-min) (point-max)))))

(define-key global-map (kbd "<f11>") 'sweave-and-htmlize-blog-entry)
(define-key global-map (kbd "<f12>") 'htmlize-blog-entry)

I took those functions from the this blog, and made some modifications. You can copy paste it directly from here, or download all the code from here

To get the syntax highlighting you must write all the necessary material in the file with extension .org. Here is a sample file with some examples from this post. When you open this file in the Emacs, it should automatically open up org-mode (if it was installed properly). Now if you want some R code to syntax highlight, you write it inside these two statements:

#+BEGIN_SRC R
fun <- function(l, nm) {
l  <- x[[nm]
ss <- summary(l)
write.csv(ss,file=paste(nm,".csv",sep=""))
}
foreach(l=x,nm=names(x)) %do% fun(l,nm)
#+END_SRC R

Then if you are finished, just press f12, and Emacs will open you a new window with shiny HTML code. Copy this code to your blog, and that is it.

If you also want to include R code which gets executed by R, like this:

> x <- rnorm(100)
> summary(x)
Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
-2.08300 -0.67620  0.03544  0.06616  0.81680  2.34000

first you must start R process with M-x R. Then include in your .org code the code snippet like this (it helps if you know Sweave syntax):

#+BEGIN_SRC Sweave
<>=
x <- rnorm(100)
summary(x)
@
#+END_SRC

and then press f11.

The keyboard shortcuts can be changed if you want, just modify the the emacs code, you’ve added to your .emacs file. The code is more or less self-explanatory. By pressing f11 you invoke function sweave-and-htmlize-blog-entry, which runs
Sweave function from R on your text. The resulting file is exported to html by org-mode. By pressing f12 you invoke function htmlize-blog-entry, which exports to html, without invoking R. Since we use org-mode export, you can syntax highlight not only R code, but other code, which has support in Emacs. This means C, emacs-lisp, latex and others.

So here it is. If something did not work, try asking here, maybe I can help. I do not promise anything, I am no Emacs expert.

Special thanks goes to Raimondas, who spent two days to get a working example, with no previous Emacs knowledge. Buy him a beer, if you find this post useful. Also do not forget the guy who wrote the initial code, and the guys who created Emacs, org-mode, htmlize, ESS, R. Buy them a beer also 🙂

Frequently I find myself having to write something like this in R:

for (nm in names(x)) {
l <- x[[nm]]
ss <- summary(l)  # produce nice table for writing
write.csv(ss,file=paste(nm,".csv",sep=""))
}


Here x is some list, where each element contains some fitted model. This always bugs me, since I cannot use lapply, or other fancy functions like *lply (from package plyr). The problem is that I need to keep the name of the list element, and all of functions mentioned, do not do that. So if I have a generic function:

fun <- function(l, nm) {
l  <- x[[nm]]
ss <- summary(l)
write.csv(ss,file=paste(nm,".csv",sep=""))
}

I cannot use it with lapply like this

 lapply(x,fun)


since then I need to pass different nm for each list element.

With foreach however it is very easy to do:

foreach(l=x,nm=names(x)) %do% {
ss <-  summary(l)
write.csv(ss,file=paste(nm,".csv",sep=""))
}


Or I can use my super-convenient function:

foreach(l=x,nm=names(x)) %do% fun(l,nm)


You can even do some interesting stuff with parallelization, since foreach supports it. So if your computer has processor with several cores you can use both for intensive calculations. BTW if anyone knows easy way of highlighting R code in wordpress, let me know.

Suppose we have following regression models of quantity:

$\displaystyle \begin{array}{rcl} \ln Q &=& \alpha + \beta P,\\ \ln Q &=& \alpha + \beta\ln P, \\ \ln Q &=&\alpha + \beta\ln P + \gamma (\ln P)^2. \end{array}$

What happens to value ${V=Q\cdot P}$ when we change price? Since ${\ln V= \ln Q+\ln P}$ we get following value models:

$\displaystyle \begin{array}{rcl} \ln V &=& \alpha + \beta P + \ln P,\\ \ln V &=& \alpha + (\beta+1)\ln P, \\ \ln V &=&\alpha + (\beta+1)\ln P + \gamma (\ln P)^2. \end{array}$

Now we can dance the usual dance of adding ${\Delta}$ quantities in both sides of the equations:

$\displaystyle \begin{array}{rcl} \ln(X+\Delta X)=\ln X + \ln\left(1+ \frac{\Delta X}{X} \right)=\ln X+ \frac{\Delta X}{X}. \end{array}$

Thus we get:

$\displaystyle \begin{array}{rcl} \frac{\Delta V}{V} &=& \beta \Delta P + \frac{\Delta P}{P},\\ \frac{\Delta V}{V} &=& (\beta+1)\frac{\Delta P}{P} , \\ \frac{\Delta V}{V} &=& (\beta+1)\frac{\Delta P}{P} + 2\gamma \left(\frac{\Delta P}{P} \ln P\right) + \gamma \left(\frac{\Delta P}{P}\right)^2. \end{array}$

So the conclusions are:

1. If we have log-linear quantity-price model and the coefficient ${\beta}$ is negative, then if we increase the price, the value will diminish, unless ${\frac{\Delta P}{P}}$ dominates ${\Delta P}$. This can happen for small values of ${P}$.
2. If we have log-log quantity-price model, then we have a paradox. If ${\beta>-1}$, no matter how much we increase the price, the value will increase. In other words, loss of unit sales is compensated by the price increase. The dream of every seller.
3. If we have log-quadratic-log quantity-price model, the relationship is more complicated. But unless ${P=1}$, negative coefficient of ${\gamma}$ ensures the decrease of value in the event of unreasonable increase of price.

Atnaujinau Miktex bei lietuvybės jame diegimo instrukcijas. Jas rasite mano puslapyje. Jei kas nors neveikia, ar yra kokių netikslumų rašykite.

### Naujausi komentarai

 vzemlys apie Rožiniai akiniai Audrius apie Rožiniai akiniai Karl apie Time series data aggregation u… Vytautas Astrauskas apie Matematinio teksto rinkimo tur… Auksinis kardas apie Drawing national flags on maps…