thr3ads.net - R devel - [Rd] how useful could be a fast and embedded database for the R community? [Dec 2014]

joanv

2014-Dec-24 18:49 UTC

[Rd] how useful could be a fast and embedded database for the R community?

I've already run some benchmarks against LevelDB and SQLite; both are quite
slow compared with my release.

They cannot be used intensively in a huge computation because of their low
performance.

Yes, billions (American ones: thousands of millions), given generous RAM.

More details at: www.vulcandb.com

My main question is where such a database would fit best: in which fields,
or in what kinds of calculations.

Regards and thank you




--
View this message in context:
http://r.789695.n4.nabble.com/how-useful-could-be-a-fast-and-embedded-database-for-the-R-community-tp4701051p4701075.html
Sent from the R devel mailing list archive at Nabble.com.

Dirk Eddelbuettel

2014-Dec-24 19:22 UTC

On 24 December 2014 at 10:49, joanv wrote:
| I've already run some benchmarks against LevelDB and SQLite; both are quite
| slow compared with my release.
| 
| They cannot be used intensively in a huge computation because of their low
| performance.
| 
| Yes, billions (American ones: thousands of millions), given generous RAM.
| 
| More details at: www.vulcandb.com
| 
| My main question is where such a database would fit best: in which fields,
| or in what kinds of calculations.

I second what Elijah said:  "working code". 

So far I see just a (very pretty) website, but strictly no code. Not good.

I work in an industry where we 
  a) do use billions of time series points and 
  b) are latency and performance sensitive

We like flat (binary) files as well as mmap a lot, and do a lot of C++ for
performance where we'd never ever dream about embedding R.  But we do of
course use R for analysis and then embed quite some C++ into it to "do stuff".
We use Redis (out of process) for a few things, but obviously not "raw data".
I like what I see from influxdb.com, but we are now off-topic for this list.

So colour me "interested" -- but please show some code, or people will tune
out pretty quickly.

Dirk

-- 
http://dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org
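
An illustrative aside, not part of the original message: a minimal Python sketch of the "flat binary file plus mmap" approach mentioned above. The record layout (a little-endian int64 timestamp followed by a float64 value) and the helper names `write_points`, `read_point` and `demo` are assumptions made for this example only; nothing here reflects the poster's actual setup.

```python
import mmap
import os
import struct
import tempfile

# Assumed fixed-width record: int64 timestamp + float64 value, little-endian.
RECORD = struct.Struct("<qd")


def write_points(path, points):
    """Write (timestamp, value) pairs as fixed-width binary records."""
    with open(path, "wb") as f:
        for ts, value in points:
            f.write(RECORD.pack(ts, value))


def read_point(mm, index):
    """Decode record `index` straight from the mapped bytes (random access)."""
    return RECORD.unpack_from(mm, index * RECORD.size)


def demo():
    """Write 1000 hypothetical points, map the file, read first and last."""
    points = [(1419446940 + i, i * 0.5) for i in range(1000)]
    fd, path = tempfile.mkstemp(suffix=".bin")
    os.close(fd)
    try:
        write_points(path, points)
        with open(path, "rb") as f:
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
                count = len(mm) // RECORD.size
                return count, read_point(mm, 0), read_point(mm, count - 1)
    finally:
        os.remove(path)
```

Because every record has the same width, the offset of record `i` is simply `i * RECORD.size`, so a lookup touches only the pages it needs and leaves caching to the operating system.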

joanv

2014-Dec-24 19:37 UTC

I'm sorry, but I cannot show code. For the moment, this project is not open
source. It has cost a lot of effort, and first of all I have to find a way
to recover the investment. If I find a way to recover the investment that is
compatible with an open source way of doing business, the project will be
open source, but first I have to find the "way".

Please check the benchmark of the first release. There is not a lot of
information about the benchmark because, as I said before, some information
is too sensitive to be published for the moment.

Of course, I would like this project to be open source, but I have to find
"the way", and ideas are welcome!




joanv

2014-Dec-24 19:39 UTC

One more thing: the database is embeddable, or will be embeddable, into R
code, and also into C, C++ or Fortran.




joanv

2014-Dec-24 19:57 UTC

One more thing: I never said the project is written in R; of course it is
not! But it can be used from R.




Barry Rowlingson

2014-Dec-25 08:49 UTC

On Wed, Dec 24, 2014 at 7:37 PM, joanv <joan.iglesias at live.com> wrote:
> I'm sorry, but I cannot show code.
 Then can you stop using the word "release"? To release means to let
something go, preferably out into the wild. I can't even find a binary
"release" on that site. Call it the first "version" if you want, but
not "release". I'm sure I'm not the only one wondering where this
"release" is downloadable in some form.
> For the moment, this project is not open
> source. It has cost a lot of effort, and first of all I have to find a way
> to recover the investment. If I find a way to recover the investment that is
> compatible with an open source way of doing business, the project will be
> open source, but first I have to find the "way".
 You say the "Open Spartacus" project [http://www.openspartacus.org/]
from which VulcanDB came failed "Due to the lack of funding". What was
your personal relationship with that project? In what way, apart from
in name, was that "Open"? I can't find source code or binary releases.
Just a fancy single-page website with *all* the buzzwords. Does this
failure not teach you anything?

 The R project and its leading lights are very proud of the open
nature of R, and so you will be talking to strong proponents of open
source software here. You've presented a project with no source or
binary release, no documentation, no API or specification, nothing. No
more than vapourware ever gives us. And then....
> Please check the benchmark of the first release. There is not a lot of
> information about the benchmark because, as I said before, some information
> is too sensitive to be published for the moment.
 ... you ask *us* to check *your* benchmark? How? We have no idea
exactly what you tested, and benchmark comparisons *depend* on that.
> Of course, I would like this project to be open source, but I have to find
> "the way", and ideas are welcome!
 Release early, release often. The only way other R users are going to
be interested is to see the source, or at the very least to see the
proposed API and be able to discuss this. I think you will find few
friends here until you do. Otherwise I suggest you jazz up your
benchmarks into a pseudo-technical paper with some 3d bar graphs and
wave it under the noses of idiot venture capitalists until one of them
throws some money at you. Good luck!

Barry
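
An illustrative aside, not part of the original message: a benchmark can only be checked by others if it states exactly what was measured. The Python sketch below times key/value-style inserts and point lookups ("seeks") against an in-memory SQLite database using the standard `sqlite3` module. The table layout, the 8-byte values, the key count and the function name `bench_sqlite_kv` are all assumptions chosen for the example; it says nothing about how the figures discussed in this thread were produced.

```python
import sqlite3
import time


def bench_sqlite_kv(n=10_000):
    """Time n inserts and n point lookups on an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE kv (k INTEGER PRIMARY KEY, v BLOB)")
    # Hypothetical workload: sequential integer keys, 8-byte values.
    rows = [(i, i.to_bytes(8, "little")) for i in range(n)]

    # Insert phase: all rows in a single transaction.
    t0 = time.perf_counter()
    with conn:
        conn.executemany("INSERT INTO kv (k, v) VALUES (?, ?)", rows)
    insert_s = time.perf_counter() - t0

    # Seek phase: one primary-key lookup per key.
    t0 = time.perf_counter()
    hits = 0
    for k, _ in rows:
        row = conn.execute("SELECT v FROM kv WHERE k = ?", (k,)).fetchone()
        if row is not None:
            hits += 1
    seek_s = time.perf_counter() - t0

    conn.close()
    return {"n": n, "hits": hits, "insert_s": insert_s, "seek_s": seek_s}
```

Reporting the workload (key order, value size, transaction batching, storage medium) alongside the timings is what lets someone else rerun the test and compare like with like.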

Joan Iglesias

2014-Dec-25 11:43 UTC

> > I'm sorry, but I cannot show code.
> 
>  Then can you stop using the word "release"? To release means to let
> something go, preferably out into the wild. I can't even find a binary
> "release" on that site. Call it the first "version" if you want, but
> not "release". I'm sure I'm not the only one wondering where this
> "release" is downloadable in some form.
I'm not a native English speaker; maybe I did not use the word appropriately.
> 
> > For the moment, this project is not open
> > source. It has cost a lot of effort, and first of all I have to find a way
> > to recover the investment. If I find a way to recover the investment that
> > is compatible with an open source way of doing business, the project will
> > be open source, but first I have to find the "way".
> 
>  You say the "Open Spartacus" project [http://www.openspartacus.org/]
> from which VulcanDB came failed "Due to the lack of funding". What was
> your personal relationship with that project? In what way, apart from
> in name, was that "Open"? I can't find source code or binary releases.
> Just a fancy single-page website with *all* the buzzwords. Does this
> failure not teach you anything?
The project was too complex, and I couldn't find close collaborators or
enough interest here in Spain. Anyway, I'm not going to tell the whole story;
this is not the right place. And yes, my intention with the other project was
for it to be "open source", but not specially focused on R.

And because I learned something from the failure, I took the best algorithms
developed for that project and made a project of "my size" and
"my resources"; that was what I learned.
> 
>  The R project and its leading lights are very proud of the open
> nature of R, and so you will be talking to strong proponents of open
> source software here. You've presented a project with no source or
> binary release, no documentation, no API or specification, nothing. No
> more than vapourware ever gives us. And then....
The version of the code I tested was only a version for benchmarking, with
implementations of seek, insert, and join. If you know a little bit about
key/value databases, you do not need much documentation to understand what
insert and seek do. Join is not a usual function in a key/value database, and
if you take a closer look, it is explained in more detail.

I cannot present anything, because the algorithms took a lot of time to
distill, and for the moment I'm not going to publish them, at least until I
can be quite sure I can get some "profit" in some way. And the first release
or version is not usable by anybody, unless you want to do benchmarks.
> 
> > Please check the benchmark of the first release. There is not a lot of
> > information about the benchmark because, as I said before, some
> > information is too sensitive to be published for the moment.
> 
>  ... you ask *us* to check *your* benchmark? How? We have no idea
> exactly what you tested, and benchmark comparisons *depend* on that.
The numbers I presented speak for themselves if you have tested other
databases or checked their benchmarks. If not, of course they have no meaning.
That was my mistake: I'm used to the numbers of most databases, and maybe I
assumed most people are too.
> 
> > Of course, I would like this project to be open source, but I have to
> > find "the way", and ideas are welcome!
> 
>  Release early, release often. The only way other R users are going to
> be interested is to see the source, or at the very least to see the
> proposed API and be able to discuss this. I think you will find few
> friends here until you do. Otherwise I suggest you jazz up your
> benchmarks into a pseudo-technical paper with some 3d bar graphs and
> wave it under the noses of idiot venture capitalists until one of them
> throws some money at you. Good luck!
Right now I do not need venture capital because, as I said before, the project
has a size I can afford. I only wanted to know in which fields it could be
most useful. I do not rule out this project becoming open source, but it
depends on my finances and on the way I finally find to do business. Right now
my finances are not good, and I have to be sure I'll get some profit from my
work.

As someone on the forum suggested to me, it's not necessary to continue
talking about this topic; I do not want to waste your time. I answered because
I thought it was necessary, given the "tone" of the reply.
> 
> Barry