thr3ads.net - R devel - [Rd] Why does my RPy2 program run faster on Windows? [May 2010]

Abhijit Bera

2010-May-19 12:21 UTC

[Rd] Why does my RPy2 program run faster on Windows?

Hi

This is my function. It serves an HTML page after the calculations. I'm
connecting to an MSSQL DB using pyodbc.

    def CAPM(self,client):

        r=self.r

        cds="1590"
        bm="20559"

        d1 = []
        v1 = []
        v2 = []


        print "Parsing GET Params"

        params=client.g[1].split("&")

        for items in params:
            item=items.split("=")

            if(item[0]=="cds"):
                cds=unquote(item[1])
            elif(item[0]=="bm"):
                bm=unquote(item[1])

        print "cds: %s bm: %s" % (cds,bm)

        print "Fetching data"

        t3=datetime.now()

        for row in self.cursor.execute(
                "select * from (select * from ("
                " select co_code,dlyprice_date,dlyprice_close from feed_dlyprice P where"
                " co_code in (%s,%s) ) DataTable PIVOT ( max(dlyprice_close) FOR co_code IN"
                " ([%s],[%s])  )PivotTable ) a order by dlyprice_date" % (cds,bm,cds,bm)):
            d1.append(str(row[0]))
            v1.append(row[1])
            v2.append(row[2])

        t4=datetime.now()

        t1=datetime.now()

        print "Calculating"

        d1.pop(0)
        d1vec = robjects.StrVector(d1)
        v1vec = robjects.FloatVector(v1)
        v2vec = robjects.FloatVector(v2)

        r1 = r('Return.calculate(%s)' %v1vec.r_repr())
        r2 = r('Return.calculate(%s)' %v2vec.r_repr())

        tl = robjects.rlc.TaggedList([r1,r2],tags=('Geo','Nifty'))
        df = robjects.DataFrame(tl)

        ts2 = r.timeSeries(df,d1vec)
        tsa = r.timeSeries(r1,d1vec)
        tsb = r.timeSeries(r2,d1vec)

        robjects.globalenv["ta"] = tsa
        robjects.globalenv["tb"] = tsb
        robjects.globalenv["t2"] = ts2
        a = r('table.CAPM(ta,tb)')

        t2=datetime.now()


       
        page = ("<html><title>CAPM</title><body>Result:<br>%s<br>Time taken by "
                "DB:%s<br>Time taken by R:%s<br>Total time elapsed:%s<br></body></html>"
                % (str(a),str(t4-t3),str(t2-t1),str(t2-t3)))
        print "Serving page:"
        #print page

        self.serveResource(page,"text",client)
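
The hand-rolled GET parsing at the top of the function could probably be handed to the standard library instead. What follows is only a rough sketch, not part of the original post: it assumes Python 3 (the posted code is Python 2), and the helper name `parse_capm_params` is made up for illustration. The default codes are the ones used in the post. Note that `parse_qs` also decodes `+` as a space and skips empty values, so it is not an exact drop-in for `split` plus `unquote`.

```python
from urllib.parse import parse_qs


def parse_capm_params(query, default_cds="1590", default_bm="20559"):
    """Return (cds, bm) from a raw query string, falling back to the defaults.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # parse_qs splits on '&' and '=', and percent-decodes each value.
    fields = parse_qs(query)
    # Take the last occurrence, matching the loop in the post where a
    # later parameter overwrites an earlier one.
    cds = fields.get("cds", [default_cds])[-1]
    bm = fields.get("bm", [default_bm])[-1]
    return cds, bm


print(parse_capm_params("cds=1590&bm=20559"))
```

Unlike indexing `item[1]` directly, this does not raise when a parameter arrives without an `=`.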



On Linux
Time taken by DB:0:00:00.024165
Time taken by R:0:00:05.572084
Total time elapsed:0:00:05.596288

On Windows
Time taken by DB:0:00:00.112000
Time taken by R:0:00:02.355000
Total time elapsed:0:00:02.467000
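
To put the two sets of numbers side by side, here is a small sketch (Python 3, not part of the original post) that parses the elapsed-time strings reported above and computes how the R step compares across platforms and what share of the total it accounts for. The helper name `parse_elapsed` is made up for illustration; the figures are copied from the timings above.

```python
from datetime import timedelta


def parse_elapsed(text):
    """Parse an 'H:MM:SS.ffffff' string, as printed by str(timedelta)."""
    hours, minutes, seconds = text.split(":")
    return timedelta(hours=int(hours), minutes=int(minutes),
                     seconds=float(seconds))


# Figures copied from the timings reported in the post.
linux_r = parse_elapsed("0:00:05.572084")
linux_total = parse_elapsed("0:00:05.596288")
windows_r = parse_elapsed("0:00:02.355000")
windows_total = parse_elapsed("0:00:02.467000")

# Dividing one timedelta by another gives a plain float in Python 3.
ratio = linux_r / windows_r
linux_share = linux_r / linux_total
windows_share = windows_r / windows_total

print("R step, Linux vs Windows: %.2fx" % ratio)
print("R share of total on Linux: %.1f%%" % (100 * linux_share))
print("R share of total on Windows: %.1f%%" % (100 * windows_share))
```

On both platforms the R step accounts for nearly all of the elapsed time, so the database fetch is not what separates them.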

Why is there such a huge difference in the time taken by R on the two
platforms? Am I doing something wrong? It's my first Rpy2 code so I guess
it's badly written.

I'm loading the following libraries:
'PerformanceAnalytics','timeSeries','fPortfolio','fPortfolioBacktest'

I'm using Rpy2 2.1.0 and R 2.11

Regards

Abhijit Bera


Abhijit Bera

2010-May-19 13:04 UTC

[Rd] Why does my RPy2 program run faster on Windows?

Update: it appears that the data conversion is not where most of the time goes.
The bulk of it is spent in the CAPM calculation. :( Does anyone know why the CAPM
calculation would be faster on Windows?
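
One way to narrow down which step dominates, as the update describes, is to time each stage separately rather than the R section as a whole. The following is a minimal sketch in Python 3 using only the standard library; it is not from the original post, and the names `stage` and `timings` are hypothetical. The two stages shown are stand-ins for the real conversion and CAPM calls.

```python
import time
from contextlib import contextmanager

# Accumulated wall-clock seconds per named stage.
timings = {}


@contextmanager
def stage(name, store=timings):
    """Record how long the enclosed block takes, under the given name."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Add to any earlier measurement with the same name.
        store[name] = store.get(name, 0.0) + (time.perf_counter() - start)


# Stand-in workloads; in the real function these would wrap the vector
# conversion and the table.CAPM call respectively.
with stage("convert"):
    values = [float(i) for i in range(1000)]
with stage("compute"):
    total = sum(values)

for name, seconds in timings.items():
    print("%-10s %.6f s" % (name, seconds))
```

Running the same instrumented function on both machines would show whether the gap sits in one stage or is spread across all of them.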

