Justin Bogner
2014-Mar-24 03:02 UTC
[LLVMdev] [RFC] Moving OnDiskHashTable from clang to LLVM
Rafael Espíndola <rafael.espindola at gmail.com> writes:>> They're obviously related. Endian.h defines types like ulittle32, which >> DTRT when read and written from memory, whereas clang::io has functions >> to read and write memory like so: >> >> void Emit32(raw_ostream& Out, uint32_t V); >> uint32_t ReadLE32(const unsigned char *&Data); >> uint32_t ReadUnalignedLE32(const unsigned char *&Data); >> >> The (aligned) ReadLE32 is easily represented by the stuff in Endian.h, >> but the write to an ostream and reading unaligned aren't addressed very >> well at present. > > ulittle32 is actually unaligned: > > typedef detail::packed_endian_specific_integral > <uint32_t, little, unaligned> ulittle32_t > > But yes, I don't think we have anything for writing and I it is > probably not trivial to extend it. We can probably add the existing > write functions to Endian.h itself.A bit more detail on this. For reading, simply changing the existing clients to use Endian.h is a bit of work, as the read functions in clang::io actually move a cursor over the read data, so simply switching would actually be something like this: - unsigned len = ReadUnalignedLE16(Items); + unsigned len = *reinterpret_cast<llvm::support::ulittle16_t *>(Items); + Items += 2; Presumably, hiding the addition helps maintainability. I suppose we could add functions functions that increment the buffer to Endian.h, and maybe expose an API like so: + unsigned len = support::readFromBuffer<support::ulittle16_t>(Items) For writing, what about adding this to raw_ostream directly? Something like: template <typename T> raw_ostream &write_le(T V) { for (size_t I = 0, E = sizeof(T); I < E; ++I) *this << (unsigned char)(0xFF & (V >> (I * 8))); return *this; } template <typename T> raw_ostream &write_be(T V) { for (size_t I = sizeof(T); I; --I) *this << (unsigned char)(0xFF & (V >> (I * 8))); return *this; } Thoughts?
Rafael Espíndola
2014-Mar-24 03:12 UTC
[LLVMdev] [RFC] Moving OnDiskHashTable from clang to LLVM
> A bit more detail on this. > > For reading, simply changing the existing clients to use Endian.h is a > bit of work, as the read functions in clang::io actually move a cursor > over the read data, so simply switching would actually be something like > this: > > - unsigned len = ReadUnalignedLE16(Items); > + unsigned len = *reinterpret_cast<llvm::support::ulittle16_t *>(Items); > + Items += 2; > > Presumably, hiding the addition helps maintainability. I suppose we > could add functions functions that increment the buffer to Endian.h, and > maybe expose an API like so: > > + unsigned len = support::readFromBuffer<support::ulittle16_t>(Items) > > > For writing, what about adding this to raw_ostream directly? Something > like: > > template <typename T> raw_ostream &write_le(T V) { > for (size_t I = 0, E = sizeof(T); I < E; ++I) > *this << (unsigned char)(0xFF & (V >> (I * 8))); > return *this; > } > > template <typename T> raw_ostream &write_be(T V) { > for (size_t I = sizeof(T); I; --I) > *this << (unsigned char)(0xFF & (V >> (I * 8))); > return *this; > } > > > Thoughts?I think I see your point. The API seem sufficiently different that there might be use cases where one or the other is better. Using an explicit increment would make the code less readable. The API in Endian.h is nice for cases where we know the entire layout of large data section, since they are composable: struct dos_header { support::ulittle16_t Magic; support::ulittle16_t UsedBytesInTheLastPage; ... }; If the use cases we have in clang or the ones you want to add don't match this, then regular stream api is probably better. I would keep it out of raw_ostream, at least for now. How a about a include/llvm/Support/EndianStream.h? Cheers, Rafael
Justin Bogner
2014-Mar-24 21:31 UTC
[LLVMdev] [RFC] Moving OnDiskHashTable from clang to LLVM
Rafael Espíndola <rafael.espindola at gmail.com> writes:> I think I see your point. The API seem sufficiently different that > there might be use cases where one or the other is better. Using an > explicit increment would make the code less readable. The API in > Endian.h is nice for cases where we know the entire layout of large > data section, since they are composable: > > struct dos_header { > support::ulittle16_t Magic; > support::ulittle16_t UsedBytesInTheLastPage; > ... > }; > > If the use cases we have in clang or the ones you want to add don't > match this, then regular stream api is probably better. I would keep > it out of raw_ostream, at least for now. How a about a > include/llvm/Support/EndianStream.h?I've sent two patches to the list, one which introduces a stream writer in EndianStream.h, and one that adds a readNext function to Endian.h: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140324/210057.html http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140324/210061.html How do these sound?