Using trie to hold library of ascii values, rather than strings? - trie

I learned about tries in my class and I was just curious if it would be a good way to implement storing individual ascii characters, rather than strings.
For example, using a trie to be able to convert to any base.
Thanks

Related

How to write data to file in Kotlin

A little while ago, I started learning Kotlin, and I have done its basics, variables, classes, lists, and arrays, etc. but the book I was learning from seemed to miss one important aspect, reading and writing to a file, maybe a function like "fwrite" in C++
So I searched google, and yes, reading and writing bytes were easy enough. However, I being used to C++'s open personality, wanted to make a "kind of" database.
In C++ I would simply make a struct and keep appending it to a file, and then read all the stored objects one by one, by placing "fread" in a for loop or just reading into an array of the struct in one go, as the struct was simply just the bytes allocated to the variables inside it.
However in Kotlin, there is no struct, instead, we use Data Class to group data. I was hoping there was an equally easy way to store data in a file in form of Data Class and read it into maybe a List of that class, or if that is not possible, maybe some other way to store grouped data that would be easy to read and write.
Easiest way is to use a serialization library. Kotlin already provides something for that
TL;DR;
Add KotlinX Serialization to your project, choose the serialization format you prefer (protobuf or cbor will fit, go for json if you prefer something more human readable although bigger in size), use the proper serializer for generating your ByteArray and write it to a file using Kotlin methods for that
Generating the ByteArray might be tricky, not sure as I'm telling this from memory. What I can tell for sure is that if you choose JSON you can get the string representation and write to a file. So I'm assuming the same will be valid for binary formats (but writing to a file in binary instead of strings)
What you need can be fulfilled by ROOM DATABASE
It is officially recommended by GOOGLE, It uses your Android application's internal Database which is made using SQLITE
You can read more info about ROOM at
https://developer.android.com/jetpack/androidx/releases/room?gclid=Cj0KCQjw5ZSWBhCVARIsALERCvwjmJqiRPAnSYjhOzhPXg8dJEYnqKVgcRxSmHRKoyCpnWAPQIWD4gAaAlBnEALw_wcB&gclsrc=aw.ds
It provided Data Object Class (DAO) and Entity Classes through which one can access the database TABLE using SQL Queries.
Also, it will check your queries at compile time for any errors in it.
Note: You need to have basic SQL Knowledge for building the queries for CRUD Operations

Serialize unity3d C# objects to string and back

Which one of the two is recommended approach given my server API is expecting a C# string? Which one will result in lowest string length?
1) Protobuf-net
Using protobuf-net to convert object <-> byte array
Use Convert.ToBase64String methods for converting byte array <-> string
2) Use Json .Net directly to convert object <-> string
We have Protobuf-net working in our project with byte[] server APIs. Now our server is migrating to string APIs instead of byte[]. We are not sure whether we should move to Json .Net or stay with protobuf-net and use Convert Base 64 for extra string to byte[] conversion.
What do you suggest?
Okay, so this is my thought process which I'm hoping can help you decide between the two:
Before deciding which one is better we need to have a better grasp of the context of the problem. Optimization is always something that has to be done under well defined "fitness" parameters.
What I mean by this is:
If you're most constrained by CPU usage, I would test to see which code uses more CPU to execute.
If bandwidth is an issue, you'd want to look at the method that sends the smallest packets. (In which case base64 of binary serialization should be the answer.)
If code readability is a factor, you should probably look at which code is easier to read / understand while taking less text to write. (In which case I suspect that the JSON route will have better readability)
In general, I would caution against over-optimization. Mainly because you might spend more time thinking and comparing than would be lost by your "unoptimized" code :)
That is to say, only optimize when you can clearly define your bottle-neck.
Hope this helped :)

vb.net bigger than decimal data type

I am using VB.net and I want to make some cryptographic computations with keys length 1024bit (128bytes). I do not want to use a know algorithm thus I cannot use the Security library.
The biggest data type in vb.net is decimal (16bytes).
how can I do those computations? is there a different data type that I am not aware of?
You may like to check System.Numerics.BigInteger in .NET 4.0.
Also you may find it interesting to check Large Number Calculations in VB.NET
It would be instructive, even if you're not going to use them, to inspect the existing classes in the System.Security.Cryptography namespace.
You'll note that most of the methods that need to deal with keys, blocks, etc, are specced in terms of byte[]. A byte[] can be as big as you want/need it to be.
(Insert usual warnings about rolling your own crypto code)

Binary file & saved game formatting

I am working on a small roguelike game, and need some help with creating save games. I have tried several ways of saving games, but the load always fails, because I am not exactly sure what is a good way to mark the beginning of different sections for the player, entities, and the map.
What would be a good way of marking the beginning of each section, so that the data can read back reliably without knowing the length of each section?
Edit: The language is C++. It looks like a readable format would be a better shot. Thanks for all the quick replies.
The easiest solution is usually use a library to write the data using XML or INI, then compress it. This will be easier for you to parse, and result in smaller files than a custom binary format.
Of course, they will take slightly longer to load (though not much, unless your data files are 100's of MBs)
If you're determined to use a binary format, take a look at BER.
Are you really sure you need binary format?
Why not store in some text format so that it can be easily parseable, be it plain text, XML or YAML.
Since you're saving binary data you can't use markers without length.
Simply write the number of records of any type and then structured data, then it will be
easy to read again. If you have variable length elements like string the also need length information.
2
player record
player record
3
entities record
entities record
entities record
1
map
If you have a marker, you have to guarantee that the pattern doesn't exist elsewhere in your binary stream. If it does exist, you must use a special escape sequence to differentiate it. The Telnet protocol uses 0xFF to mark special commands that aren't part of the data stream. Whenever the data stream contains a naturally occurring 0xFF, then it must be replaced by 0xFFFF.
So you'd use a 2-byte marker to start a new section, like 0xFF01. If your reader sees 0xFF01, it's a new section. If it sees 0xFFFF, you'd collapse it into a single 0xFF. Naturally you can expand this approach to use any length marker you want.
(Although my personal preference is a text format (optionally compressed) or a binary format with length bytes instead of markers. I don't understand how you're serializing it without knowing when you're done reading a data structure.)

What's the canonical way to store arbitrary (possibly marked up) text in SQL?

What do wikis/stackoverflow/etc. do when it comes to storing text? Is the text broken at newlines? Is it broken into fixed-length chunks? How do you best store arbitrarily long chunks of text?
nvarchar(max) ftw. because over complicating simple things is bad, mmkay?
I guess if you need to offer the ability to store large chunks of text and you don't mind not being able to look into their content too much when querying, you can use CLobs.
This all depends on the RDBMS that you are using as well as the types of text that you are going to store. If the text is formatted into sizable chunks of data that mean something in and of themselves, like, say header/body, then you might want to break the data up into columns of these types. It may take multiple tables to use this method depending on the content that you are dealing with.
I don't know how other RDBMS's handle it, but I know that that it's not a good idea to have more than one open ended column in each table (text or varchar(max)). So you will want to make sure that only one column has unlimited characters.
Regarding PostgreSQL - use type TEXT or BYTEA. If you need to read random chunks you may consider large objects.
If you need to worry about keeping things like formatting strings, quotes, and other "cruft" in the text, as code would likely have, then the special characters need to be completely escaped first - otherwise on submission the db, they might end up causing an invalid command to be issued.
Most scripting languages have tools to do this built-in natively.
I guess it depends on where you want to store the text, if you need things like transactions etc.
Databases like SQL Server have a type that can store long text fields. In SQL Server 2005 this would primarily be nvarchar(max) for long unicode text strings. By using a database you can benefit from transactions and easy backup/restore assuming you are using the database for other things like StackOverflow.com does.
The alternative is to store text in files on disk. This may be fairly simple to implement and can work in environments where a database is not available or overkill.
Regards the format of the text that is stored in a database or file, it is probably very close to the input. If it's HTML then you would just push it through a function that would correctly escape it.
Something to remember is that you probably want to be using unicode or UTF-8 from creation to storage and vice-versa. This will allow you to support additional languages. Any problem with this encoding mechanism will corrupt your text. Historically people may have defaulted to ASCII based on the assumption they were saving disk space etc.
For SQL Server:
Use a varchar(max) to store. I think the upper limit is 2 GB.
Don't try to escape the text yourself. Pass the text through a parameterizing structure that will do the escapes properly for you. In .Net you'd add a parameter to a SqlCommand, or just use LinqToSQL (which then manages the SqlCommand for you).
I suspect StackOverflow is storing text in markdown format in arbitrarily-sized 'text' column. Maybe as UTF8 (but it might be UTF16 or something. I'm guessing it's SQL Server, which I don't know much about).
As a general rule you want to store stuff in your database in the 'rawest' form possible. That is, do all your decoding, and possibly cleaning, but don't do anything else with it (for example, if it's Markdown, don't encode it to HTML, leave it in its original 'raw' format)