> > > By default, when saving a URI using xmlSaveUri it escapes everything in > the URI. QEMU doesn't want anything escaped, so now I unescape > everything after the URI is generated. Unfortunately there's no flag to > change the default behavior. > > I'm not sure that's the actual issue here, but I'm somehow included to > think this is another consequence of the lack of query string handling > for http/https URIs. > > In any case, do you have a simple reproducer for this escaping handling > for qemu? >I don't have a simple repro for qemu, but this is pretty close. In guestfish do this: add "/vhds/osdiskforconsul0-osdisk.vhd?se=2016-01-01T00%3A00%3A00Z&sp=r&sv=2014-02-14&sr=b&sig=LOlrHXrQeaqlSEP51hRi7E5KDa9lnkqSvLTaZBmTkrQ%3D" readonly:true protocol:https server: gabhartswarmstorage.blob.core.windows.net After you execute 'run', you should see qemu complaining about something like a 404 Error and the file not existing. After the unescape change the above will start working.> > > The other problem was that only the "path" portion of the URI struct > was being used to indicate the path. That's natural enough, but that > practice is what was dropping the query string. The query string is kept > separately from the path. I now concat the query string back onto the URI > with a separating '?'. > > I don't think that appending the query string to the path is a good idea. > For example, we do a minimum of parsing of the URI, and for some > protocols (like the "We may have to adjust the path ..." comment says) > we adjust path according to the elements in the query string.Are you talking about this comment? /* We may have to adjust the path depending on the protocol. For * example ceph/rbd URIs look like rbd:///pool/disk, but the * exportname expected will be "pool/disk". Here, uri->path will be * "/pool/disk" so we have to knock off the leading '/' character. */ This is talking about removing leading the '/' in the path sometimes depending on protocol. Nothing to do with the query string.> > + if (asprintf(&path, "%s?%s", uri->path, uri->query_raw) == -1) { > > + perror ("asprintf: path + query_raw"); > > + free (*protocol_ret); > > + guestfs_int_free_string_list (*server_ret); > > + free (*username_ret); > > + free (*password_ret); > > + return -1; > > + } > > 'path' created here is leaked. Also, this needs to take into account > that either uri->path or uri->query_raw may be null. >Good point. Fixed. See attached patch.> > *path_ret = strdup (path ? path : ""); > > - if (*path_ret == NULL) { > > - perror ("strdup: path"); > > - free (*protocol_ret); > > - guestfs_int_free_string_list (*server_ret); > > - free (*username_ret); > > - free (*password_ret); > > - return -1; > > - } >Why did you remove the error checking?>I didn't, it was just moved. In any case see the attached fix.> Ad additional checking for URIs, we have fish/test-add-uri.sh, which > fails with your patch. You might want to also add additional checks > with for query string http/https URIs. >The test is now failing on this case: + guestfish -x -a 'nbd:///export?socket=/sk' + grep -sq 'add_drive *"/export"* "protocol:nbd" "server:unix:/sk"' test-add-uri.out because I am now appending the query string as in: add_drive* "/export?socket=/sk"* "protocol:nbd" "server:unix:/sk" In which cases should the query string be appended and in what cases should it be dropped? I believe in the http and https case it should be appended. I'm not familiar with the nbd protocol. Is dropping the query string always the right thing to do for this protocol? Other protocols? -- Gabriel
In data venerdì 26 giugno 2015 18:50:16, Gabriel Hartmann ha scritto:> > > > > By default, when saving a URI using xmlSaveUri it escapes everything in > > the URI. QEMU doesn't want anything escaped, so now I unescape > > everything after the URI is generated. Unfortunately there's no flag to > > change the default behavior. > > > > I'm not sure that's the actual issue here, but I'm somehow included to > > think this is another consequence of the lack of query string handling > > for http/https URIs. > > > > In any case, do you have a simple reproducer for this escaping handling > > for qemu? > > > > I don't have a simple repro for qemu, but this is pretty close. In > guestfish do this: > > add > "/vhds/osdiskforconsul0-osdisk.vhd?se=2016-01-01T00%3A00%3A00Z&sp=r&sv=2014-02-14&sr=b&sig=LOlrHXrQeaqlSEP51hRi7E5KDa9lnkqSvLTaZBmTkrQ%3D" > readonly:true protocol:https server: > gabhartswarmstorage.blob.core.windows.net > > After you execute 'run', you should see qemu complaining about something > like a 404 Error and the file not existing. After the unescape change the > above will start working.In this case, you can check in a simplier way: $ guestfish -x -a '/vhds/etc...' you'll see a "libguestfs: trace: add_drive ..." line contaning the actual add_drive call which guestfish will do to the library, and the parameters passed to it. Also, I'm thinking we should focus on making sure to have the proper query string for http(s) URIs to add_drive, we'll deal with the percent-encoding on the way.> > > The other problem was that only the "path" portion of the URI struct > > was being used to indicate the path. That's natural enough, but that > > practice is what was dropping the query string. The query string is kept > > separately from the path. I now concat the query string back onto the URI > > with a separating '?'. > > > > I don't think that appending the query string to the path is a good idea. > > For example, we do a minimum of parsing of the URI, and for some > > protocols (like the "We may have to adjust the path ..." comment says) > > we adjust path according to the elements in the query string. > > > Are you talking about this comment? > /* We may have to adjust the path depending on the protocol. For > * example ceph/rbd URIs look like rbd:///pool/disk, but the > * exportname expected will be "pool/disk". Here, uri->path will be > * "/pool/disk" so we have to knock off the leading '/' character. > */ > > This is talking about removing leading the '/' in the path sometimes > depending on protocol. Nothing to do with the query string.Yes, although it was not directly to the specified code snippet. The advice here is that some protocols don't need a query string in the path, because parts of it might be handled differently. See also below.> > Ad additional checking for URIs, we have fish/test-add-uri.sh, which > > fails with your patch. You might want to also add additional checks > > with for query string http/https URIs. > > > > The test is now failing on this case: > + guestfish -x -a 'nbd:///export?socket=/sk' > + grep -sq 'add_drive *"/export"* "protocol:nbd" "server:unix:/sk"' > test-add-uri.out > > because I am now appending the query string as in: > add_drive* "/export?socket=/sk"* "protocol:nbd" "server:unix:/sk"It didn't expect a query string before, and it must not take it now, as the parameters are not related to the actual path, but they are connection parameters.> In which cases should the query string be appended and in what cases should > it be dropped? I believe in the http and https case it should be > appended. I'm not familiar with the nbd protocol. Is dropping the query > string always the right thing to do for this protocol? Other protocols?Surely http(s) drives need to have the query string that was specified in guestfish URIs, just like the add_drive API need to properly deal with such extra parameter, and manage them depending on the protocol. We have few possibilities at this point, in order of complexity: (a) split the query string into an "hash table" (i.e. a C list of key/value items), pass that as new parameter to add_drive, and rebuild a query string depending on the backend (direct/qemu, libvirt, etc) and on the protocol (http, nbd, etc) (b) pass the query string as-is (percent-encoded or not) to add_drive, and add it back depending on the protocol (c) join the query string to the path, depending on the protocol, in guestfish Your solution so far is (c), while IMHO a better one would be (b) or even (a). Also, something more to check is how e.g. libvirt needs such kind of URI parameter after the path, and how. You can check whether your change also work with libvirt by exporting LIBGUESTFS_BACKEND=libvirt (or also using "set-backend libvirt" in guestfish), provided you have built libguestfs with libvirt support. Thanks, -- Pino Toscano
(apologies if you see this twice, forgot the CC the first time) In data venerdì 26 giugno 2015 18:50:16, Gabriel Hartmann ha scritto:> > > > > By default, when saving a URI using xmlSaveUri it escapes everything in > > the URI. QEMU doesn't want anything escaped, so now I unescape > > everything after the URI is generated. Unfortunately there's no flag to > > change the default behavior. > > > > I'm not sure that's the actual issue here, but I'm somehow included to > > think this is another consequence of the lack of query string handling > > for http/https URIs. > > > > In any case, do you have a simple reproducer for this escaping handling > > for qemu? > > > > I don't have a simple repro for qemu, but this is pretty close. In > guestfish do this: > > add > "/vhds/osdiskforconsul0-osdisk.vhd?se=2016-01-01T00%3A00%3A00Z&sp=r&sv=2014-02-14&sr=b&sig=LOlrHXrQeaqlSEP51hRi7E5KDa9lnkqSvLTaZBmTkrQ%3D" > readonly:true protocol:https server: > gabhartswarmstorage.blob.core.windows.net > > After you execute 'run', you should see qemu complaining about something > like a 404 Error and the file not existing. After the unescape change the > above will start working.In this case, you can check in a simplier way: $ guestfish -x -a '/vhds/etc...' you'll see a "libguestfs: trace: add_drive ..." line contaning the actual add_drive call which guestfish will do to the library, and the parameters passed to it. Also, I'm thinking we should focus on making sure to have the proper query string for http(s) URIs to add_drive, we'll deal with the percent-encoding on the way.> > > The other problem was that only the "path" portion of the URI struct > > was being used to indicate the path. That's natural enough, but that > > practice is what was dropping the query string. The query string is kept > > separately from the path. I now concat the query string back onto the URI > > with a separating '?'. > > > > I don't think that appending the query string to the path is a good idea. > > For example, we do a minimum of parsing of the URI, and for some > > protocols (like the "We may have to adjust the path ..." comment says) > > we adjust path according to the elements in the query string. > > > Are you talking about this comment? > /* We may have to adjust the path depending on the protocol. For > * example ceph/rbd URIs look like rbd:///pool/disk, but the > * exportname expected will be "pool/disk". Here, uri->path will be > * "/pool/disk" so we have to knock off the leading '/' character. > */ > > This is talking about removing leading the '/' in the path sometimes > depending on protocol. Nothing to do with the query string.Yes, although it was not directly to the specified code snippet. The advice here is that some protocols don't need a query string in the path, because parts of it might be handled differently. See also below.> > Ad additional checking for URIs, we have fish/test-add-uri.sh, which > > fails with your patch. You might want to also add additional checks > > with for query string http/https URIs. > > > > The test is now failing on this case: > + guestfish -x -a 'nbd:///export?socket=/sk' > + grep -sq 'add_drive *"/export"* "protocol:nbd" "server:unix:/sk"' > test-add-uri.out > > because I am now appending the query string as in: > add_drive* "/export?socket=/sk"* "protocol:nbd" "server:unix:/sk"It didn't expect a query string before, and it must not take it now, as the parameters are not related to the actual path, but they are connection parameters.> In which cases should the query string be appended and in what cases should > it be dropped? I believe in the http and https case it should be > appended. I'm not familiar with the nbd protocol. Is dropping the query > string always the right thing to do for this protocol? Other protocols?Surely http(s) drives need to have the query string that was specified in guestfish URIs, just like the add_drive API need to properly deal with such extra parameter, and manage them depending on the protocol. We have few possibilities at this point, in order of complexity: (a) split the query string into an "hash table" (i.e. a C list of key/value items), pass that as new parameter to add_drive, and rebuild a query string depending on the backend (direct/qemu, libvirt, etc) and on the protocol (http, nbd, etc) (b) pass the query string as-is (percent-encoded or not) to add_drive, and add it back depending on the protocol (c) join the query string to the path, depending on the protocol, in guestfish Your solution so far is (c), while IMHO a better one would be (b) or even (a). Also, something more to check is how e.g. libvirt needs such kind of URI parameter after the path, and how. You can check whether your change also work with libvirt by exporting LIBGUESTFS_BACKEND=libvirt (or also using "set-backend libvirt" in guestfish), provided you have built libguestfs with libvirt support. Thanks, -- Pino Toscano
Re-adding Snesha: +sneshaf@microsoft.com <sneshaf@microsoft.com> Hi Pino, Thanks, this looks like a good list of alternatives. I think (b) is probably the way to go. I'm willing to just add the query string to the path in only the HTTP and HTTPS cases for now. If other protocols run into some problems around lacking query strings we can probably re-visit the issue. The (a) option seems needlessly complex and bug prone to me. -- Gabriel> We have few possibilities at this point, in order of complexity: > > (a) split the query string into an "hash table" (i.e. a C list of > key/value items), pass that as new parameter to add_drive, and rebuild > a query string depending on the backend (direct/qemu, libvirt, etc) and > on the protocol (http, nbd, etc) > > (b) pass the query string as-is (percent-encoded or not) to add_drive, > and add it back depending on the protocol > > (c) join the query string to the path, depending on the protocol, in > guestfish > > Your solution so far is (c), while IMHO a better one would be (b) or > even (a). > > Also, something more to check is how e.g. libvirt needs such kind of > URI parameter after the path, and how. You can check whether your > change also work with libvirt by exporting LIBGUESTFS_BACKEND=libvirt > (or also using "set-backend libvirt" in guestfish), provided you have > built libguestfs with libvirt support. >