J.A. Bezemer
2001-Nov-22 15:07 UTC
[vorbis] [OT] Prior art & could use your help - Content distribution
[NOTE: This message may be regarded as off-topic. Please reply privately only, not to the list.]

Hi all!

Over the past few months, I've been collaborating with a few people on subjects that fall in the category of web radio. (In case you don't know what that is: imagine a radio with a built-in analog modem that receives and plays MP3 or rather Vorbis streams from the net.) Of course such a system would start quite small, but we're "designing" it to handle at least 100,000 receivers (radios) and 3,000 senders (radio stations) easily.

This mail is NOT intended to discuss reasons, feasibility, or profitability. We don't know at all if we'll ever have the courage to turn theory into practice.

However, during our discussions, several interesting ideas have surfaced that we couldn't find published anywhere (but then, searching isn't very easy). So, these ideas could just be considered inventions (even though WE won't) and people could just submit patent applications for them, at least in the USA and probably also in Europe. We don't want that to happen, so we are publicly and widely disclosing our ideas.

We had the intention to use the gratis facility at PriorArt.org, and if that were still in operation, there wouldn't have been any need to bother you. But PriorArt.org was closed down, and we don't have the US$100 that IP.com asks (apart from the question of whether they can be trusted).

If we understand things correctly, patent law only recognizes a document as prior art (i.e. for _completely_ invalidating a patent claim) if it a) was posted publicly and widely, and available/accessible to people reasonably skilled in the subject matter, and b) has a clear and unforgeable publication date. The first requirement should be met by posting this disclosure to several forums related to the subject; the second requirement is where you come in.
TO HELP US IN THIS MATTER: you can easily, undeniably and unforgeably confirm that this disclosure is publicly accessible today, by:

1) completing this simple form:

   Name : ______
   Address/City/Country [handy but not required] : ______
   I hereby confirm that the disclosure below was publicly accessible on [date] : ______

and

2) mailing back this ENTIRE message (possibly quoted in any way), _GPG/PGP-SIGNED_, to this address: priorart@opensourcepartners.nl

Note that you do not have to actually read the rest of this message; mailing back a _signed_ copy is enough. We will NOT publish anything we receive, only store it in a safe place to use it in the courts should the need arise.

Thanks in advance for your help,

Anne Bezemer

-----------------------------------------------------------------------------

DISCLOSURE: METHOD FOR PIECEWISE DISTRIBUTION OF ANY LIVE/NON-LIVE
            CONTENT OVER DIGITAL (COMPUTER) NETWORKS, AND DECENTRALIZED SYSTEM TO
            CONTROL THAT DISTRIBUTION

Streaming is hot. Streaming MP3 radio shows is hot. Streaming live webcasts is hot. Everyone wants streaming media.

But streaming has one big problem. Every client needs one connection to the streaming server and all streamed data is transmitted to each individual client over its own connection. This means massive bandwidth usage at the server (which is very expensive) and requires a powerful server to handle all simultaneous connections.

Multicasting could help (if it were widely implemented and reliable) but in our situation there is a much more interesting solution: "piecewise streaming".

The basic idea of piecewise streaming is simple. At the source, the content stream(s) is/are cut into small parts of, for example, 15 to 30 seconds.
The result is a series of static files that can be distributed in any way traditionally used (or newly invented) for distributing static files, such as e-mail with multiple To: addresses on the same host, or multiple clients accessing one mailbox; or Freenet with its automatic caching/multiplication mechanisms.

But the easiest way is just a webserver. Web space for limited use is cheap and often comes free with a dial-in account (that our "radios" need anyway, so we have plenty). And the second advantage is that practically all ISPs have a simple caching web proxy, which can easily cache our content when it is distributed in static parts from a webserver, without requiring any reconfiguration. Since we can ("should be able to") control all "radios" with a management system, it should be easy to have everyone that listens to one particular stream dial in to the same ISP and use the same caching web proxy.

Some other advantages of this method:
- Content is stored while transmitting; no separate storage server is needed.
- Content is stored and kept for a considerable time; this allows pausing and resuming of live broadcasts, possibly with some small overlap when restarting. (And pausing/resuming can also mean turning the "radio" completely off and on again.)
- To get, say, 1 minute of content in the "radio"'s buffer, you don't have to wait one minute for the live content to stream in; you just start playing at live-minus-1-minute and the buffer will be filled at full modem speed. (Actually this is possible with traditional streaming too, but not widely used.)
- With the easily-filled 1-minute buffer, it is possible to hang up and dial in to another ISP without the user noticing anything.
- Error detection and error recovery are easy with per-piece checksums and re-downloading of corrupted pieces; there is enough time with a 1-minute buffer.
- Switching servers is trivial; any piece of the content can be available from another server using another transport method (HTTP, FTP, whatever, either client- or server- or third-party-initiated). There can also be multiple ways to get any particular piece.
- Works for any kind of content; there is no difference any more between static files (like text and pictures) and streams (like audio and video). Also works with non-public (for example encrypted, like RealAudio) content; the way to access it (for example a decryption key) can be obtained via other means (for example a separate negotiation with an authentication server).
- Client-side processing can be implemented relatively easily. For example a radio program only needs to consist of pointers to never-changing MP3/Vorbis pieces and instructions for the client to perform cross-fading, sometimes adding in a little live piece or a news broadcast shared by several stations. (The MP3/Vorbis pieces can be hidden from the general public in any way conventionally used (or newly invented) to hide files.)
- Content streams and content codings can be combined in any way the user wants and the "radio" is capable of. For example audio, video and subtitling (text) with no transmission of video and subtitling when the viewer window is minimized; or automatic switching between high- and low-quality codings depending on the user's activities, available CPU power and bandwidth, without the content server noticing anything.

This distribution method needs some form of control (or management), if only to let the clients know what pieces in what codings are available where using which transmission method. Of course this can be done using any communication method, such as a (probably automatically updated) webpage, but that would not be very scalable or reliable.
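
To make the control data a bit more concrete, here is a minimal Python sketch of what a per-piece record and a client-side fetch with checksum verification and source fallback might look like. Everything in it is an assumption made for illustration only: the record layout (`Piece`, `Source`), the function name `fetch_piece`, and the use of SHA-256 as the per-piece checksum are hypothetical, since the disclosure does not prescribe a record format, a checksum algorithm, or any particular transport code.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Source:
    """One place a piece can be fetched from (hypothetical record layout)."""
    transport: str   # e.g. "http" or "ftp"
    location: str    # e.g. a URL on some webserver


@dataclass
class Piece:
    """Control record for one piece of one stream in one coding."""
    stream: str
    coding: str      # e.g. "vorbis" or "mp3"
    index: int       # position of the piece within the stream
    sha256: str      # per-piece checksum (the algorithm is an assumption)
    sources: List[Source] = field(default_factory=list)


def fetch_piece(piece: Piece,
                fetchers: Dict[str, Callable[[str], bytes]]) -> Optional[bytes]:
    """Try each source in turn; return the first copy whose checksum matches."""
    for src in piece.sources:
        fetch = fetchers.get(src.transport)
        if fetch is None:
            continue  # this client does not support that transport
        try:
            data = fetch(src.location)
        except OSError:
            continue  # server unreachable: switch to the next source
        if hashlib.sha256(data).hexdigest() == piece.sha256:
            return data
        # corrupted piece: fall through and re-download from another source
    return None  # no source delivered an intact copy
```

The `fetchers` mapping is supplied by the caller (one function per transport), so the sketch stays independent of how HTTP, FTP, or any other method is actually implemented on the "radio". With a 1-minute buffer of 15 to 30 second pieces, a client would have time to walk through several sources before playback is affected.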
Instead, we are more interested in a decentralized and redundant system of "control servers" (as opposed to for example "content servers" or possibly "content translation servers" (for re-coding, mixing, whatever)). A structure like the Gnutella network is possible, in which each node (control server) would have connections to a constant (on average) number of other nodes. Clients only need to connect to one of the control servers to have access to all available control data.

The data present in the entire control system can be organized in a database-like fashion with relatively small records; queries can be issued on the control server network and all servers having matching records will send a reply. Each control server should have a few backup servers somewhere in the network that are kept up to date on changes in (all, or a part of) that server's records, and that can take over the functionality if that server disappears somehow. This means data (records) has a high mobility and can be at any control server at any point in time.

A consequence of high data mobility is that queries must have been processed by all nodes (possibly represented by their backups) in the network before conclusions can be drawn. The only way to be really certain of that is not to wait a long time and just hope everyone has seen the query (not reliable and very user-unfriendly too), but to have everyone that has seen the query actually send a confirmation of that fact, in one way or another. This is where the standard Gnutella structure is less useful.

There are two trivial network structures that possess the properties we need, namely the fully-connected network and the ring structure. The former has the advantage of very small and constant search times, but needs connections to every node, which does not scale well.
The latter can be compared to a token ring network, in which each node only has 2 connections, but this has the disadvantage of potentially very long search times because the nodes search sequentially rather than in parallel. (In a unidirectional ring network, the "query seen" confirmation is implicit with the sender receiving back its own query; in a bidirectional network, the node in which the "collision" occurs can open a separate connection to the query-sender.) And of course a simple ring may have large problems with unreliable connections.

We are currently investigating hybrid networks, combining several structures into one network, concentrating mostly on hybrid ring/fully-connected networks. These can be described either as a set of fully connected groups, interconnected by one or more rings; or as one or more rings, of which (some or all) nodes are members of (one or more) fully connected groups that interconnect the rings. The idea is that the number of rings and fully connected groups can be scaled adaptively to achieve both acceptable complexity and acceptable search time.

-----------------------------------------------------------------------------

--- >8 ----
List archives: http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
To unsubscribe from this list, send a message to 'vorbis-request@xiph.org' containing only the word 'unsubscribe' in the body. No subject is needed. Unsubscribe messages sent to the list will be ignored/filtered.
Arne Jonny Bakkevold
2001-Nov-22 23:14 UTC
[vorbis] [OT] Prior art & could use your help - Content distribution
At 00:07 23.11.01 +0100 J.A. Bezemer stirred the mighty wrath of Conan saying:
>DISCLOSURE: METHOD FOR PIECEWISE DISTRIBUTION OF ANY LIVE/NON-LIVE
> CONTENT OVER DIGITAL (COMPUTER) NETWORKS, AND DECENTRALIZED SYSTEM TO
> CONTROL THAT DISTRIBUTION
>[...]

Arne Jonny Bakkevold