I have the following data (all of it will be TEXT). The data is for language files used for multi-language translation.
data: {
    'MAIN_KEY': {
        'A_UNIQUE_KEY': 'data1',
        'A_UNIQUE_KEY': 'data2',
        'A_UNIQUE_KEY': 'data3',
    },
    'MAIN_KEY': {
        'A_UNIQUE_KEY': 'data1',
        'A_UNIQUE_KEY': 'data2',
        'A_UNIQUE_KEY': 'data3',
        'A_UNIQUE_KEY': 'data2',
        'A_UNIQUE_KEY': 'data3',
    },
    ......
}
Here, the MAIN_KEY will be different for each set; in this case it is a component name, e.g. LOGIN_PAGE.
A_UNIQUE_KEY will also be different in each case; in this case it is a field name, e.g. USER_NAME, and so on.
The number of key-value pairs in each set will also be different.
I need to store the data for multiple languages, but the MAIN_KEY and A_UNIQUE_KEY will be the same in every file.
Every file will have the same structure; only the values data1, data2, ... will be different.
What I want to achieve is to store and manage the data in this format and later generate a JSON file through an API for my different applications. I should be able to do CRUD operations on this data.
Is creating a database the only option here?
XML and JSON are hierarchical structures like a tree, where a parent node can have child nodes and each child node has exactly one parent node, whereas a database is typically a relational system where an entity can have many parents, siblings and children, so to speak. :-)
This means that in order to store your JSON you have at least these three options:
Create a simple database in an RDBMS where each child table has just one parent table.
Have just one table in an RDBMS and store your JSON in a row of that table. (After all, the JSON is just a string.)
Use a non-relational DBMS. There are even dedicated JSON DBMSs, if I'm not mistaken.
Which option you choose would depend on your data and what you want to do with it.
I ended up using DynamoDB.
With DynamoDB I can store the data in one table using a composite key, and use DynamoDBMapper to do the CRUD operations.
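To illustrate the composite-key idea (the exact item layout here is an assumption, not the schema the answer used): each translation string can be stored as a flat item whose partition key is the language and whose sort key combines MAIN_KEY and A_UNIQUE_KEY, and the nested JSON file is then rebuilt per language. A minimal sketch in plain Python, with the DynamoDB read replaced by a hard-coded list of items:

```python
import json

# Hypothetical flat items as they might come back from a DynamoDB query:
# partition key = language, sort key = "MAIN_KEY#UNIQUE_KEY".
items = [
    {"lang": "en", "key": "LOGIN_PAGE#USER_NAME", "value": "User name"},
    {"lang": "en", "key": "LOGIN_PAGE#PASSWORD", "value": "Password"},
    {"lang": "de", "key": "LOGIN_PAGE#USER_NAME", "value": "Benutzername"},
]

def build_language_json(items, lang):
    """Rebuild the nested {MAIN_KEY: {UNIQUE_KEY: value}} structure
    from the flat composite-key items for one language."""
    out = {}
    for item in items:
        if item["lang"] != lang:
            continue
        main_key, unique_key = item["key"].split("#", 1)
        out.setdefault(main_key, {})[unique_key] = item["value"]
    return out

print(json.dumps(build_language_json(items, "en"), indent=2))
```

Because every item is a separate row keyed by language and string id, CRUD on a single translation is a single-item operation.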
I'm trying to figure out the best way to store graph data structures in an SQL database. After some research, it seems that I can store graph Nodes in a table and just create a join table with the many-to-many relationships between them which would represent the edges (or connections). That seems exactly what I was looking for, but now I want to introduce the users who own the nodes.
From the performance point of view, would it make sense to create a new join table userNodes, or just save users as nodes assuming that node is a generic structure? And what are the implications of storing everything in a single table?
If you have individual attributes that should be stored on a per-node level, then those attributes should be in the nodes table. That is what the table is for.
If the attributes are really a list, then you would want another table. For instance, if multiple users could own a node, then one option would be a userNodes table. However, as you describe the data, there is only one user per node.
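As a sketch of that layout (table and column names here are made up for illustration): node-level attributes such as the owning user live on the nodes table, and a separate edges join table holds the many-to-many connections. Using SQLite via Python's standard library:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE nodes (
    id INTEGER PRIMARY KEY,
    label TEXT,
    user_id INTEGER REFERENCES users(id)  -- one owner per node
);
CREATE TABLE edges (
    from_node INTEGER REFERENCES nodes(id),
    to_node   INTEGER REFERENCES nodes(id),
    PRIMARY KEY (from_node, to_node)
);
""")
conn.execute("INSERT INTO users VALUES (1, 'alice')")
conn.executemany("INSERT INTO nodes VALUES (?, ?, ?)",
                 [(1, 'a', 1), (2, 'b', 1), (3, 'c', 1)])
conn.executemany("INSERT INTO edges VALUES (?, ?)", [(1, 2), (1, 3)])

# Neighbours of node 1, together with each neighbour's owner:
rows = conn.execute("""
    SELECT n.label, u.name
    FROM edges e
    JOIN nodes n ON n.id = e.to_node
    JOIN users u ON u.id = n.user_id
    WHERE e.from_node = 1
    ORDER BY n.id
""").fetchall()
```

A userNodes join table would only be needed if ownership became many-to-many.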
TL;DR:
I want to use a non-relational design to store a tree of nodes in a self-referencing table because we will never need to relationally select subsets of data. This allows for extremely simple recursive storage and retrieval functions.
Coworker wants to use a relational design to store each specific field of the object; I assume because he believes relational is simply always better (he doesn't have any specific reasons). This would require more tables and more complex storage and retrieval functions, and I don't think it would benefit us in any way.
Are there any specific benefits or pitfalls to either of the design methods?
How are trees normally stored in databases? Self-referencing tables?
Are there any known samples of trees of data being stored in databases that might coincide with the task we are trying to solve?
At work we are using a complex structure to describe an object, unfortunately I cannot share the exact structure because of work restrictions but I will give an equivalent example of the structure and explain the features of it.
The structure can be represented in json but actually conforms to a much tighter syntax restriction.
There are four kinds of entities in the structure:
top level node
This node is a json object and it must be the top level json object
This node must contain exactly 4 attributes (meta info 1 through 4)
This node must contain exactly 1 'main' container node
container nodes
These are json objects that contain other containers and pattern nodes
Must contain exactly 1 attribute named 'container_attribute'
May contain any number of other containers and patterns
pattern nodes
These are json objects that contain exactly 3 attributes
A pattern is technically a container
May not contain anything else
attribute nodes
These are just json string objects
The top level container is always a json object that contains 4 attributes and exactly 1 container called 'main_container'
All containers must contain a single attribute called 'container_attribute'.
All patterns must contain exactly three attributes
An example of a structure in json looks like the following:
{
    "top_level_node": {
        "meta_info_1": "meta_info_keyword1",
        "meta_info_2": "meta_info_keyword2",
        "meta_info_3": "meta_info_keyword3",
        "meta_info_4": "unique string of data",
        "main_container": {
            "container_attribute": "container_attribute_keyword",
            "sub_container_1": {
                "container_attribute": "container_attribute_keyword",
                "pattern_1": {
                    "pattern_property_1": "pattern_property_1_keyword",
                    "pattern_property_2": "pattern_property_2_keyword",
                    "pattern_property_3": "unique string of data"
                },
                "pattern_2": {
                    "pattern_property_1": "pattern_property_1_keyword",
                    "pattern_property_2": "pattern_property_2_keyword",
                    "pattern_property_3": "unique string of data"
                }
            },
            "pattern_3": {
                "pattern_property_1": "pattern_property_1_keyword",
                "pattern_property_2": "pattern_property_2_keyword",
                "pattern_property_3": "unique string of data"
            }
        }
    }
}
We want to store this structure in our internal office database, and I am suggesting that we use three tables: one to store all json objects in a self-referencing table, one to store all json strings in a table that references the json object table, and a third table to tie the top level containers to an object name.
The schema would look something like this:
An attributes table would be used to store everything that is a json string, with a reference to the parent container id:
CREATE TABLE attributes (
id int DEFAULT nextval('attributes_id_seq'::text),
name varchar(255),
container_id int,
type int,
value_type int,
value varchar(255)
);
The containers table would be used to store all containers in a self-referencing table to create the 'tree' structure:
CREATE TABLE containers (
id int DEFAULT nextval('containers_id_seq'::text),
parent_container_id int
);
And then a single list of object names that point to the top level container id for the object:
CREATE TABLE object_names (
id int DEFAULT nextval('object_names_id_seq'::text),
name varchar(255),
container_id int
);
The nice thing about the above structure is it makes for a really simple recursive function to iterate the tree and store attributes and containers.
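To show how simple that recursive retrieval is, here is a sketch against a reduced version of the schema (SQLite integer primary keys instead of sequences, and the attributes table trimmed to name/value; all data values are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE containers (
    id INTEGER PRIMARY KEY,
    name TEXT,
    parent_container_id INTEGER REFERENCES containers(id)
);
CREATE TABLE attributes (
    id INTEGER PRIMARY KEY,
    name TEXT,
    container_id INTEGER REFERENCES containers(id),
    value TEXT
);
""")
conn.executemany("INSERT INTO containers VALUES (?, ?, ?)", [
    (1, "main_container", None),
    (2, "sub_container_1", 1),
])
conn.executemany("INSERT INTO attributes VALUES (?, ?, ?, ?)", [
    (1, "container_attribute", 1, "container_attribute_keyword"),
    (2, "container_attribute", 2, "container_attribute_keyword"),
])

def rebuild(container_id):
    """Recursively rebuild a container and everything below it
    as a nested dict: attributes first, then child containers."""
    node = {name: value for name, value in conn.execute(
        "SELECT name, value FROM attributes WHERE container_id = ?",
        (container_id,))}
    for child_id, child_name in conn.execute(
            "SELECT id, name FROM containers WHERE parent_container_id = ?",
            (container_id,)).fetchall():
        node[child_name] = rebuild(child_id)
    return node

tree = rebuild(1)
```

Storage is the mirror image: walk the object depth-first, inserting each container with its parent's id.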
The downside is it's not relational whatsoever and therefore doesn't help to perform complex relational queries to retrieve sets of information.
The reason I say we should use this is that we have absolutely no reason to select pieces of these objects in a relational manner: the data on each object is only useful in the context of that object, and we have no situation where we will need to select this data for any reason except rebuilding the object.
However, my coworker is saying that we should be using a relational database design to store this, and that each of the 'keyword' attributes should have its own table (a container keyword table, 3 pattern keyword tables, 4 top level keyword tables).
The result is storing these objects in the suggested relational design becomes significantly more complex and requires many more tables.
Note that query speed/efficiency is not an issue because this object/database is for internal use for purposes that are not time-sensitive at all. Ultimately all we are doing with this is creating new 'objects' and storing them and then later querying the database to rebuild all objects.
If there is no benefit to a relational database design then is there any reason to use it over something that allows for such a simple storage/retrieval API?
Is there any significant issues with my suggested schema?
"We will never need to X" is a rather bold assumption that turns out to be unwarranted more often than you might suspect. In fact, with tree structures in particular, the requirement naturally arises to "zoom into a node" and treat that node as a tree in its own right for a while.
EDIT
And in case it wasn't clear why that matters: relational approaches tend to offer more flexibility because that flexibility is built into the data structure. Non-relational approaches (which typically imply that everything is solved in code) tend to lead to additional rounds of code rework once requirements start to evolve.
I am new to databases and sql and would like to design a database for a fitness app that will keep track of workouts at the gym.
In my app, I have designed a custom workout object that has a name (e.g. 'Chest day'), an ID (some number) and a date (string). Each workout object contains an array of exercises, another custom object, which has a property called 'set'. The set is also a custom object with only two numeric properties: number of reps and weight (e.g. 10 reps at 50 lbs).
What I thought of is to have one table for the workouts, another for the exercises and another for the sets. The problem is I do not know how to connect the tables (i.e. link multiple exercises to a unique workout and link multiple sets to a unique exercise) and am not sure if this is even the correct approach.
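The usual way to connect the tables is with foreign keys: each exercise row stores the id of its workout, and each set row stores the id of its exercise. A sketch using SQLite from Python (table and column names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE workouts (
    id INTEGER PRIMARY KEY, name TEXT, date TEXT
);
CREATE TABLE exercises (
    id INTEGER PRIMARY KEY,
    workout_id INTEGER REFERENCES workouts(id),  -- many exercises per workout
    name TEXT
);
CREATE TABLE sets (
    id INTEGER PRIMARY KEY,
    exercise_id INTEGER REFERENCES exercises(id),  -- many sets per exercise
    reps INTEGER, weight REAL
);
""")
conn.execute("INSERT INTO workouts VALUES (1, 'Chest day', '2024-01-15')")
conn.execute("INSERT INTO exercises VALUES (1, 1, 'Bench press')")
conn.executemany("INSERT INTO sets VALUES (?, 1, ?, ?)",
                 [(1, 10, 50.0), (2, 8, 60.0)])

# All sets for one workout, following the two foreign keys:
rows = conn.execute("""
    SELECT e.name, s.reps, s.weight
    FROM workouts w
    JOIN exercises e ON e.workout_id = w.id
    JOIN sets s ON s.exercise_id = e.id
    WHERE w.id = 1
    ORDER BY s.id
""").fetchall()
```

This is the normalized relational approach; the NoSQL answer below takes the opposite route and keeps everything in one item.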
Also, I planned to set up the backend for this app using the amazon web services mobile hub which provides a noSQL database.
In NoSQL, you should keep all the attributes in a single table. You shouldn't normalize the data as you would in an RDBMS. Also, try to stay away from joins. The main advantage of NoSQL is that everything is kept as one item, so you don't need a join to get the result.
Advantages of this approach are:
1) Fast response, as all the data is present as one item in the table
2) Schemaless database, i.e. you can add new attributes at any time (no need to alter the table and add new columns)
DynamoDB design for the above use case:
The combination of partition and sort key should be unique
name -String (Partition Key)
id -Number (Sort Key)
date - String
exercise : [array of values] - List data type
custom_set : {rep : 1, weight : 2} - Map data type
Important note:
The important thing while designing a data model for DynamoDB is that all the data retrieval use cases (i.e. query access patterns) should be known up front, so that an appropriate model can be designed.
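To make the single-item design concrete, here is what one workout item could look like as a plain dictionary (the exercise_name and nested sets attribute names are illustrative additions, not part of the answer above):

```python
# One workout == one DynamoDB item: the whole object graph is embedded,
# so reading a workout never requires a join.
workout_item = {
    "name": "Chest day",          # partition key (String)
    "id": 1,                      # sort key (Number)
    "date": "2024-01-15",
    "exercise": [                 # List data type
        {
            "exercise_name": "Bench press",   # hypothetical attribute
            "sets": [                         # List of Maps
                {"rep": 10, "weight": 50},
                {"rep": 8, "weight": 60},
            ],
        }
    ],
}
```

The trade-off is that updating a single set means rewriting (or path-updating) part of a nested document rather than touching one small row.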
I'm building a database in SQLite with multiple tables. It will work like a tag-based search, where CARS are compared based on how many TAGS match between them. There will also be one layer used to categorize items, called MANUFACTURER. So a typical use case would be: the user selects MANUFACTURER1 (let's say Ford) as an input and MANUFACTURER2 (let's say Toyota) as an output, enters a CAR [the database compares TAGS of CARS between the two MANUFACTURERS] and fetches a CAR recommendation from MANUFACTURER2. I am using Core Data with entities for each, but this does not involve newly created objects, just what's in the original SQL database.
My question is - is it better to generate the search with SQLite code, or NSPredicate/NSCompoundPredicate? Are there performance differences?
If you use Core Data with a SQLite store, an NSFetchRequest with a specific predicate will be resolved at the SQL level, so you don't need to add anything to it.
Core Data abstracts this for you. If you use Core Data you cannot run your own queries; just stick with NSFetchRequests and NSPredicates.
Maybe what you need is to import the db you have into the actual Core Data store.
Maybe I don't fully understand your question, but what's your goal?
I am using ZF2 and Doctrine to develop a project that manages calendar events. All of the events share a common set of data elements, but unique types of events share their own unique sets of type-specific data elements. For example, all of the events include common elements such as eventID, eventName and eventDate. In addition to those common elements, events that are “meetings” will have additional elements like agenda, minutes or attendees that are specific to meetings, events that are “training” will have additional elements that are specific to training, events that are “conferences” will have additional unique elements, and so on.
The project’s index view will want to be able to list all events, but will not require any data beyond the “common” data set. If an event belongs to a certain type, it WILL NOT also belong to another type, so each event will only be associated with one supplemental data set. Some event types might have more than one route: the route sales/meetings[/action] will want to access the same or identical entities, fieldsets and forms as the route marketing/meetings[/action].
In my database I will have a table named events to index all of the events and record common data, and I’ll have a collection of associated tables like events_meetings, events_training, events_conferences, and so on to record type-specific data.
I’m contemplating a number of different solutions for the project:
Solution one: a single module, a single events entity and fieldset, and entities and fieldsets for each of the type-specific data element groups. This solution would require that the events entity have multiple OneToMany elements: one for each of the associated data sets. I don’t know whether this is possible; and even if it is possible, I don’t know whether it is a good idea.
Solution two: a single module and duplicate copies of the common “events” entity and fieldset: events_1 is linked to the events table and is associated with the events_meetings entity; events_2 is identical to events_1 except for the OneToMany element and it is associated with the events_training entity; and events_3, events_4, events_5, and so on are each associated with their own supplemental data set entities. I can see this working, but it requires a lot of nearly identical copies of the common data entity.
Solution three: multiple modules, each with a single events entity and fieldset, and a single associated events_foo entity and fieldset. This is perhaps the cleanest solution, although it seems to create a lot of identical code.
Solution four: reconfigure the data schema so that all of the supplemental data could be stored in a single table. For example, rather than having an events_meetings table that has a single row for each meeting event and a column for agenda, a column for minutes and a column for attendees, it’s possible to create an events_alt_data table that has a different row for each element and columns such as eventID, elementType, elementTitle and eventValue. Wordpress does something like this for unique data, but in my project the supplemental data sets are where the majority of the data will be stored and I’m concerned that it may affect performance as the data grows. This solution will also require some creative coding to deal with the conditional nature of data elements and how to validate and set options for data that could be any type or length.
Any advice?
Single Table Inheritance is the way to go.
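In relational terms, Single Table Inheritance means one events table holding every event type, with a discriminator column that says which subtype each row is; subtype-specific columns are simply NULL for other types (in Doctrine this corresponds to the @DiscriminatorColumn / @DiscriminatorMap mapping). A minimal sketch in Python/SQLite, with illustrative column names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE events (
    eventID INTEGER PRIMARY KEY,
    eventName TEXT, eventDate TEXT,
    discr TEXT,            -- 'meeting', 'training', 'conference', ...
    agenda TEXT,           -- meeting-specific
    minutes TEXT,          -- meeting-specific
    trainer TEXT           -- training-specific (hypothetical)
)""")
conn.execute("""INSERT INTO events VALUES
    (1, 'Q3 review', '2024-09-01', 'meeting', 'budget', NULL, NULL)""")
conn.execute("""INSERT INTO events VALUES
    (2, 'SQL basics', '2024-09-02', 'training', NULL, NULL, 'Ann')""")

# The index view needs only the common columns, regardless of type:
common = conn.execute(
    "SELECT eventName, eventDate FROM events ORDER BY eventID").fetchall()

# Type-specific views filter on the discriminator:
meetings = conn.execute(
    "SELECT eventName, agenda FROM events WHERE discr = 'meeting'").fetchall()
```

This avoids both the duplicated entities of solution two and the one-table-per-keyword explosion, at the cost of sparse NULL columns when subtypes diverge a lot.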