I have 2 existing applications that I'd like to bridge somehow. Both have similar domains (Product Catalogs), but the first application uses a NoSQL document store for records, and the 2nd application uses SQL tables.
An example record from the first application looks something like:
{
"productId": 123,
"sku": "abc",
"packageSizes": {
"container": 20,
"pallet": 50
}
}
Whereas the same item in the 2nd domain would be 1 row in the ProductItem table:
| id | sku |
| 123 | abc |
Then 2 rows in the ProductPackageSizes table:
| productId | type | size |
| 123 | container | 20 |
| 123 | pallet | 50 |
The systems are currently completely independent, but whenever a record is created in the NoSQL application, I'd like the same item to be created in the SQL application.
I can write a one-off script for this that just creates it procedurally, based on what the data looks like currently. However, I would be interested to know if there are any established design patterns that describe such transformations, particularly if new packageSizes or other relations are added in the future.
Do you not have any foreign keys in the SQL database?
You can add a foreign key to the ProductPackageSizes table that references ProductItem. Then, every time you create a new record in the NoSQL database, you can insert
"productId": 123, "sku": "abc"
into the ProductItem table, like:
| id | sku |
| 123 | abc |
and insert
"packageSizes": {
"container": 20,
"pallet": 50 }
into ProductPackageSizes with a foreign key (like P_id), like:
| productId | type | size | P_id |
| 123 | container | 20 | 123 |
| 123 | pallet | 50 | 123 |
Hope this helps :)
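As a rough illustration of keeping the transformation declarative rather than procedural, here is a minimal Python sketch. It assumes SQLite as a stand-in for the SQL application; `CHILD_TABLES`, `document_to_rows` and `insert_rows` are hypothetical names, and P_id is omitted because it would only repeat productId. A new nested relation would then need one more mapping entry instead of new code.

```python
import sqlite3

# Hypothetical mapping: nested document field -> child table that stores it.
CHILD_TABLES = {
    "packageSizes": "ProductPackageSizes",
}


def document_to_rows(doc):
    """Flatten one product document into (table, row) pairs."""
    rows = [("ProductItem", {"id": doc["productId"], "sku": doc["sku"]})]
    for field, table in CHILD_TABLES.items():
        for type_name, size in doc.get(field, {}).items():
            rows.append(
                (table, {"productId": doc["productId"], "type": type_name, "size": size})
            )
    return rows


def insert_rows(conn, rows):
    """Insert all rows for one document in a single transaction."""
    with conn:  # commits on success, rolls back on error
        for table, row in rows:
            # Table and column names come from the trusted mapping above,
            # never from user input; values are bound as parameters.
            columns = ", ".join(row)
            placeholders = ", ".join("?" for _ in row)
            conn.execute(
                f"INSERT INTO {table} ({columns}) VALUES ({placeholders})",
                tuple(row.values()),
            )


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE ProductItem (id INTEGER PRIMARY KEY, sku TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE ProductPackageSizes ("
    " productId INTEGER NOT NULL REFERENCES ProductItem(id),"
    " type TEXT NOT NULL,"
    " size INTEGER NOT NULL,"
    " PRIMARY KEY (productId, type))"
)

doc = {"productId": 123, "sku": "abc", "packageSizes": {"container": 20, "pallet": 50}}
insert_rows(conn, document_to_rows(doc))

item_rows = conn.execute("SELECT id, sku FROM ProductItem").fetchall()
package_rows = conn.execute(
    "SELECT productId, type, size FROM ProductPackageSizes ORDER BY type"
).fetchall()
```

The parent row is inserted first so that the foreign key on productId is satisfied when the child rows follow.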
I am doing an integration with a customer's ERP. The database tables follow a convention: columns that share a name, in any table, must have the same data type.
With this premise, I would like to generate a SQL statement or a stored procedure that pulls data from several source tables, in a given order and always matching on column names, into 2 target tables. As it is highly probable that the ERP vendor will add new columns without notifying my department, I need the columns to be obtained dynamically.
All this is to generate a single record in a table (in this case, the head data of a purchase to a supplier), and several rows in another table (the items of the purchase).
My idea is to have an auxiliary table where I put the information coming from my system, and then, execute that SQL/procedure to consolidate the information into the ERP purchase tables.
Let's take an example.
My tables would have information similar to this
(Purchase header)
ExternalOrderId | SupplierCode | PurchaseDate | PurchaseStatus | FiscalYear | Series
--------------------------------------------------------------------------------------------
ABCD | 00001 | 2021-12-11 12:00:00 | DRAFT | 2021 | S
(Purchase items)
ExternalOrderId | ArticleCode | ItemOrder | Units
--------------------------------------------------
ABCD | 1234 | 1 | 2
ABCD | 2345 | 5 | 4
ABCD | 3456 | 10 | 10
ABCD | 1234 | 15 | 3 (very important, same article can be repeated multiple times in one purchase)
.....
ABCD | 9999 | 100 | 10
A very important step is to take the fiscal year, series and number from a table of counters. The counter should be incremented after the process.
Example of table "Counters" (note that there may be several numbers for one type, depending on the series and the fiscal year):
Type | FiscalYear | Series | LastNumber
----------------------------------------------------
SupplierPurchase | 2021 | S | 26
SupplierPurchase | 2021 | A | 60
SupplierPurchase | 2021 | B | 15
SaleOrder | 2021 | S | 19
SaleOrder | 2021 | X | 200
Table "Accounting data".
SupplierCode | AdditionalColumn1 | AdditionalColumn2 | AdditionalColumn3
-------------------------------------------------------------------------
00001 | AC1A | AC2A | AC3A
Table "Company data".
SupplierCode | AdditionalColumn2 | AdditionalColumn3 | AdditionalColumn4
-------------------------------------------------------------------------
00001 | AC2B | AC3B | AC4B
Table "Supplier data".
SupplierCode | AdditionalColumn3 | AdditionalColumn5
-----------------------------------------------------
00001 | AC3C | AC5C
In this case, for columns with the same name, the data coming from the last table read should be kept. For example, AdditionalColumn1 will have the value from the first table (AC1A), because that is the only table with that column name, while AdditionalColumn3 will have the value from the last table (AC3C).
The final result should look something like this:
Purchase Header
FiscalYear | Series | Number | SupplierCode | AdditionalColumn1 | AdditionalColumn2 | AdditionalColumn3 | AdditionalColumn4 | AdditionalColumn5 | PurchaseStatus | PurchaseDate | ExternalPurchaseID
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
2021 | S | 27 | 00001 | AC1A | AC2B | AC3C | AC4B | AC5C | DRAFT | 2021-12-11 12:00:00 | ABCD
Note that the purchase number is 27, because in the counters table the last number used for the series "S" was 26. After creating this row, the counter must be set to 27.
In the case of the purchase items, it would be the same, obtaining the data from:
The purchase header created in the previous step.
Data from the Articles table
Data from another table with additional information about the articles.
The data from the purchase items table that I generated earlier.
But in this case, instead of being a single record, it will be a record for each item that I reflect in my auxiliary table, matching the info by the item's "ArticleCode".
I could do all this through programmed code, but I would like to abstract from the programming language and include all this in the database logic, to make a very fast, transactional process that can be retried in case of failure. Besides, as I said, they will be dynamic columns, since the ERP provider will be able to create new columns. In this way, I will not have to worry about having to escape the information of possible unicode characters and I will be sure that the data types are respected at all times.
It would be nice if I could get a boolean flag set on my auxiliary table to indicate that the purchase has been consolidated correctly.
Thanks in advance
EDIT
As @JeroenMostert said in one response, this question is too vague. The purpose of my question is to know how to take the column names of a table A (obtained, for example, from INFORMATION_SCHEMA.COLUMNS) and use them in a query, but only the ones that intersect with the columns of a table B, and to do this several times with several tables so that I can generate the header of the purchase. I would then use the same process (and the resulting data) to generate the purchase rows.
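To make the intended column intersection concrete, here is a minimal Python sketch. It is not ERP-specific: it assumes SQLite as a stand-in database, so `PRAGMA table_info` takes the place of the INFORMATION_SCHEMA.COLUMNS lookup, and the table names (AccountingData, CompanyData, SupplierData, PurchaseHeader) and helper functions are hypothetical. Sources are read in order, and a later source overwrites an earlier one for any column they share.

```python
import sqlite3


def column_names(conn, table):
    """Return the column names of a table (SQLite-specific metadata lookup)."""
    return [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]


def merge_matching_columns(conn, target, sources, key_column, key_value):
    """Collect values for every target column found in the sources; last source wins."""
    target_columns = set(column_names(conn, target))
    merged = {}
    for source in sources:
        # Only columns that exist in both the source and the target.
        shared = [c for c in column_names(conn, source) if c in target_columns]
        if not shared:
            continue
        select_list = ", ".join(f'"{c}"' for c in shared)
        row = conn.execute(
            f'SELECT {select_list} FROM "{source}" WHERE "{key_column}" = ?',
            (key_value,),
        ).fetchone()
        if row is not None:
            merged.update(zip(shared, row))
    return merged


def insert_merged(conn, target, values):
    """Insert the merged values into the target table in one transaction."""
    columns = ", ".join(f'"{c}"' for c in values)
    placeholders = ", ".join("?" for _ in values)
    with conn:
        conn.execute(
            f'INSERT INTO "{target}" ({columns}) VALUES ({placeholders})',
            tuple(values.values()),
        )


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE AccountingData (SupplierCode TEXT, AdditionalColumn1 TEXT,
                             AdditionalColumn2 TEXT, AdditionalColumn3 TEXT);
CREATE TABLE CompanyData (SupplierCode TEXT, AdditionalColumn2 TEXT,
                          AdditionalColumn3 TEXT, AdditionalColumn4 TEXT);
CREATE TABLE SupplierData (SupplierCode TEXT, AdditionalColumn3 TEXT,
                           AdditionalColumn5 TEXT);
CREATE TABLE PurchaseHeader (SupplierCode TEXT, AdditionalColumn1 TEXT,
                             AdditionalColumn2 TEXT, AdditionalColumn3 TEXT,
                             AdditionalColumn4 TEXT, AdditionalColumn5 TEXT);
INSERT INTO AccountingData VALUES ('00001', 'AC1A', 'AC2A', 'AC3A');
INSERT INTO CompanyData VALUES ('00001', 'AC2B', 'AC3B', 'AC4B');
INSERT INTO SupplierData VALUES ('00001', 'AC3C', 'AC5C');
""")

header = merge_matching_columns(
    conn,
    "PurchaseHeader",
    ["AccountingData", "CompanyData", "SupplierData"],
    "SupplierCode",
    "00001",
)
insert_merged(conn, "PurchaseHeader", header)
```

Because the column lists are read from the catalog at run time, a column the vendor adds later to both a source and the target would be picked up without code changes. The counter handling and the purchase items are left out of this sketch.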
I have two Excel tables: the first one is the data table and the second one is a lookup table. Here is how they are structured:
Data Table
+----------+-------------+----------+----------+
| Category | Subcategory | Division | Business |
+----------+-------------+----------+----------+
| A | Red | Home | Q |
| B | Blue | Office | R |
| C | Green | City | S |
| D | Yellow | State | T |
| D | Red | State | T |
| D | Green | Office | Q |
+----------+-------------+----------+----------+
Lookup Table
+----------+-------------+----------+----------+--------------+
| Category | Subcategory | Division | Business | LookUp Value |
+----------+-------------+----------+----------+--------------+
| 0 | 0 | 0 | Q | ABC |
| B | 0 | Office | 0 | DEF |
| C | Green | 0 | 0 | MNO |
| D | 0 | State | T | RST |
+----------+-------------+----------+----------+--------------+
So I want to add the lookup value column to the data table based on the criteria given in the lookup table. E.g., for the first row in the lookup table, I don't want to look up on Category, Subcategory, or Division, but if the Business is Q, then I want to populate the lookup value as ABC. Similarly, for the second row I don't want to consider the Subcategory and Business, but if the Category is "B" and Division is "Office", I want it to populate DEF. So the result should look like this:
[Final Resulting Data Table]
+----------+-------------+----------+----------+--------------+
| Category | Subcategory | Division | Business | LookUp Value |
+----------+-------------+----------+----------+--------------+
| A | Red | Home | Q | ABC |
| B | Blue | Office | R | DEF |
| C | Green | City | S | MNO |
| D | Yellow | State | T | RST |
| D | Red | State | T | RST |
| D | Green | Office | Q | ABC |
+----------+-------------+----------+----------+--------------+
I am very new to SQL and the actual data set is very complex, with multiple lookup values based on different criteria. If you think any other scripting language would work better, I am open to that too. My data is currently in Excel.
If the data is so complex, you should first consider if you want to put it in a (relational) database (like MS Access, MySQL, etc.) instead of in a spreadsheet (like MS Excel).
Both kinds of programs are used for structured data handling, but databases focus primarily on efficient data storage and data integrity (including guarding type safety, required fields, unique fields, required references between various datasets/tables, etc.), while spreadsheets focus primarily on data analysis and calculations.
Relational databases support Structured Query Language (SQL) to let clients query their data. Spreadsheets normally do not use or support SQL (as far as I know).
It is possible to let MS Excel import or reference data in an external data source (like a relational database) to perform analysis and calculations on it.
The other way around is (sometimes) possible too: to link to spreadsheet worksheets as external tables inside a relational database system to - within certain limits - allow that data to be queried using SQL. But using a database to store the data and a spreadsheet (as a database client) to perform analysis on the data in the database would be a more logical design in my opinion.
However, creating such an integrated solution using multiple MS Office applications and/or external databases can be a complex challenge, especially when you are just starting to learn about them.
To be honest, I am not experienced with designing MS Office based solutions, so I cannot guide you around any pitfalls. I do hope that this answer helps you a little with finding the right way to go here...
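Since the question is open to other scripting languages, the wildcard matching itself can be sketched in a few lines of Python. This is only a sketch under two assumptions that the question does not state: "0" in the lookup table means "ignore this column", and when several lookup rows match, the first one in table order wins. The names `LOOKUP_RULES` and `lookup_value` are made up for the example.

```python
WILDCARD = "0"  # assumption: "0" in the lookup table means "do not compare this column"

# (Category, Subcategory, Division, Business) -> LookUp Value, in lookup-table order.
LOOKUP_RULES = [
    (("0", "0", "0", "Q"), "ABC"),
    (("B", "0", "Office", "0"), "DEF"),
    (("C", "Green", "0", "0"), "MNO"),
    (("D", "0", "State", "T"), "RST"),
]

DATA_ROWS = [
    ("A", "Red", "Home", "Q"),
    ("B", "Blue", "Office", "R"),
    ("C", "Green", "City", "S"),
    ("D", "Yellow", "State", "T"),
    ("D", "Red", "State", "T"),
    ("D", "Green", "Office", "Q"),
]


def lookup_value(row, rules=LOOKUP_RULES):
    """Return the value of the first rule whose non-wildcard columns all equal the row's."""
    for criteria, value in rules:
        if all(c == WILDCARD or c == cell for c, cell in zip(criteria, row)):
            return value
    return None  # no rule matched


results = [row + (lookup_value(row),) for row in DATA_ROWS]
for result in results:
    print(result)
```

If the precedence should be different (for example, the most specific rule wins), only the order of `LOOKUP_RULES` would need to change.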
I have a table that contains the history of Customer IDs that have been merged in our CRM system. The data in the historical reporting Oracle schema exists as it was when the interaction records were created. I need a way to find the Current ID associated with a customer from potentially an old ID. To make this a bit more interesting, I do not have permissions to create PL/SQL for this, I can only create Select statements against this data.
Sample Data in customer ID_MERGE_HIST table
| OLD_ID | NEW_ID |
+----------+----------+
| 44678368 | 47306920 |
| 47306920 | 48352231 |
| 48352231 | 48780326 |
| 48780326 | 50044190 |
Sample Interaction table
| INTERACTION_ID | CUST_ID |
+----------------+----------+
| 1 | 44678368 |
| 2 | 48352231 |
| 3 | 80044190 |
I would like a query with a recursive sub-query to provide a result set that looks like this:
| INTERACTION_ID | CUST_ID | CUR_CUST_ID |
+----------------+----------+-------------+
| 1 | 44678368 | 50044190 |
| 2 | 48352231 | 50044190 |
| 3 | 80044190 | 80044190 |
Note: Cust_ID 80044190 has never been merged, so does not appear in the ID_MERGE_HIST table.
Any help would be greatly appreciated.
You can look at the CONNECT BY construct.
Also, you might want to experiment with a recursive WITH clause (one description: http://gennick.com/database/understanding-the-with-clause). CONNECT BY is better, but Oracle-specific.
If this is a frequent request, you may want to store the first/last cust_id for all related records.
First cust_id - will be static, but will require 2 hops to get to the current one
Last cust_id - will give you the result immediately, but requires an update of the whole tree with every new record
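For the recursive WITH route, here is a minimal sketch of the idea. It is run against SQLite through Python only so that it is self-contained; Oracle's recursive subquery factoring syntax differs in details (for instance, it does not use the RECURSIVE keyword), so treat the query as an outline rather than something to paste into Oracle. The lowercase table names are stand-ins for the tables in the question.

```python
import sqlite3

CURRENT_ID_QUERY = """
WITH RECURSIVE chain(start_id, cur_id) AS (
    -- Anchor: every customer ID seen in an interaction points at itself.
    SELECT DISTINCT cust_id, cust_id FROM interaction
    UNION
    -- Step: follow one merge from the ID reached so far.
    SELECT c.start_id, m.new_id
    FROM chain AS c
    JOIN id_merge_hist AS m ON m.old_id = c.cur_id
)
SELECT i.interaction_id, i.cust_id, c.cur_id AS cur_cust_id
FROM interaction AS i
JOIN chain AS c ON c.start_id = i.cust_id
-- Keep only the end of each chain: an ID that was never merged again.
WHERE NOT EXISTS (SELECT 1 FROM id_merge_hist AS m WHERE m.old_id = c.cur_id)
ORDER BY i.interaction_id
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE id_merge_hist (old_id INTEGER, new_id INTEGER)")
conn.execute("CREATE TABLE interaction (interaction_id INTEGER, cust_id INTEGER)")
conn.executemany(
    "INSERT INTO id_merge_hist VALUES (?, ?)",
    [
        (44678368, 47306920),
        (47306920, 48352231),
        (48352231, 48780326),
        (48780326, 50044190),
    ],
)
conn.executemany(
    "INSERT INTO interaction VALUES (?, ?)",
    [(1, 44678368), (2, 48352231), (3, 80044190)],
)

rows = conn.execute(CURRENT_ID_QUERY).fetchall()
for row in rows:
    print(row)
```

An ID that never appears in the merge history, such as 80044190, is its own chain end, so it maps to itself without needing an outer join.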
Sorry if this is a rather basic question, but I'm a SQL Server noob in need of help.
I have 2 types of loan providers, Lender and Pingtree.
Both Lender and Pingtree can have a relationship with MatchService, which would need to be able to store their ID.
At the moment I'm struggling to work out how I can create a relationship between them. To demonstrate, I've created a simple visual of what I want to do in the real world (ringed in red) and what I think could be a possible solution in SQL Server. In essence, Lender and Pingtree would each have a ProviderId, and this would also be the ID stored in the Match table.
All advice appreciated.
If I were designing this table I would use a Provider to Match table, and store the common attributes in the Provider table. For the non-common attributes, I would create them as a name/value pair table that can link back to the provider Id.
edit: added sample of data structure.
MatchService (Key would be MatchId + ProviderId)
|MatchId |ProviderId|
---------------------------
| 1 | 1 |
| 2 | 1 |
| 3 | 2 |
| 4 | 1 |
Provider (Key would be ProviderId)
|ProviderId |ProviderType |ProviderName |StartDateTime | EndDateTime |
--------------------------------------------------------------------------
|1 |Lender |Stark Ind. |1/1/2013 00:00|1/1/2014 00:00|
|2 |Pingtree |MoneyBags |1/1/2013 00:00|1/1/2014 00:00|
Name/Value Pair Table (For Unique, Key would be ProviderId + Name)
| ProviderId | Name | Value |
-------------------------------------------------------
| 1 | PointOfContact | Tony Stark |
| 1 | Contact Phone Number | 101-202-3456 |
| 2 | Customer Service Number | 402-123-4567 |
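The structure above could be declared roughly as follows. This is a sketch only: it uses SQLite through Python so that it runs standalone, whereas the question is about SQL Server, whose DDL differs in places. The table name `ProviderAttribute` for the name/value pair table and the CHECK constraint on ProviderType are my own assumptions, and the date columns are left out for brevity.

```python
import sqlite3

SCHEMA = """
CREATE TABLE Provider (
    ProviderId   INTEGER PRIMARY KEY,
    ProviderType TEXT NOT NULL CHECK (ProviderType IN ('Lender', 'Pingtree')),
    ProviderName TEXT NOT NULL
);
CREATE TABLE MatchService (
    MatchId    INTEGER NOT NULL,
    ProviderId INTEGER NOT NULL REFERENCES Provider(ProviderId),
    PRIMARY KEY (MatchId, ProviderId)
);
-- Hypothetical name for the name/value pair table.
CREATE TABLE ProviderAttribute (
    ProviderId INTEGER NOT NULL REFERENCES Provider(ProviderId),
    Name       TEXT NOT NULL,
    Value      TEXT NOT NULL,
    PRIMARY KEY (ProviderId, Name)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript(SCHEMA)

with conn:
    conn.executemany(
        "INSERT INTO Provider VALUES (?, ?, ?)",
        [(1, "Lender", "Stark Ind."), (2, "Pingtree", "MoneyBags")],
    )
    conn.executemany(
        "INSERT INTO MatchService VALUES (?, ?)",
        [(1, 1), (2, 1), (3, 2), (4, 1)],
    )
    conn.executemany(
        "INSERT INTO ProviderAttribute VALUES (?, ?, ?)",
        [
            (1, "PointOfContact", "Tony Stark"),
            (1, "Contact Phone Number", "101-202-3456"),
            (2, "Customer Service Number", "402-123-4567"),
        ],
    )

# Each match resolves to its provider, whether that is a Lender or a Pingtree.
matches = conn.execute(
    "SELECT m.MatchId, p.ProviderType, p.ProviderName"
    " FROM MatchService AS m JOIN Provider AS p ON p.ProviderId = m.ProviderId"
    " ORDER BY m.MatchId"
).fetchall()
```

With a single Provider table, MatchService needs only one foreign key, regardless of the provider type.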
I have a project with a MySQL database, and I would like to be able to upload various datasets. Say I am building a restaurant reviews aggregator. We would like to keep adding every source of restaurant reviews we can get our hands on, and keep all the information.
I have a table review_sources
=========================
| id | name |
=========================
| 1 | Zagat |
| 2 | GoodEats Magazine|
| ... |
| 50 | Allergy News |
=========================
Now say I have a table reviews
=====================================================================
| id | Restaurant Name | source_id | Star Rating | Description |
=====================================================================
| 0 | Joey's Burgers | 1 | 3.5 | Wow! |
| 1 | Jamal's Steaks | 1 | 3.5 | Yummy! |
| 2 | Jenny's Crepes | 1 | 4.5 | Sweet! |
| .... |
| 253| Jeeva's Curries | 3 | 4 | Spicy! |
=====================================================================
Now suppose someone wants to add reviews from "Allergy News", which have a field "nut-free". Or a source of reviews could describe the degree of kashrut compliance, or halal compliance, or vegan-friendliness. As a designer, I don't know what optional fields future data sources may have. I want to be able to answer queries such as:
What are all the fields in the Zagat reviews?
For review id=x, what is value of the optional field "vegan-friendly"?
So how do I design a schema that can handle these disparate data sources and answer these queries? My reasons for not going for NoSQL are that I do want certain types of normalization, and that this is part of an existing MySQL based project.
I'd use a many-to-many relationship: a table of fields (e.g. "vegan-friendly"), and a reviews_fields table that maps a review_id to a field and stores the value of that field for that review.
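Here is one way that layout could look, as a minimal sketch. It uses SQLite through Python so it runs on its own, while the project is on MySQL, so the DDL would need adapting. The `fields` table, the column names, and the two optional-field values inserted below are made up for illustration; they are not real review data.

```python
import sqlite3

SCHEMA = """
CREATE TABLE review_sources (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE reviews (
    id              INTEGER PRIMARY KEY,
    restaurant_name TEXT NOT NULL,
    source_id       INTEGER NOT NULL REFERENCES review_sources(id)
);
CREATE TABLE fields (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE reviews_fields (
    review_id INTEGER NOT NULL REFERENCES reviews(id),
    field_id  INTEGER NOT NULL REFERENCES fields(id),
    value     TEXT NOT NULL,
    PRIMARY KEY (review_id, field_id)
);
"""


def fields_for_source(conn, source_name):
    """All optional field names that appear in reviews from one source."""
    rows = conn.execute(
        "SELECT DISTINCT f.name FROM fields AS f"
        " JOIN reviews_fields AS rf ON rf.field_id = f.id"
        " JOIN reviews AS r ON r.id = rf.review_id"
        " JOIN review_sources AS s ON s.id = r.source_id"
        " WHERE s.name = ? ORDER BY f.name",
        (source_name,),
    ).fetchall()
    return [name for (name,) in rows]


def field_value(conn, review_id, field_name):
    """Value of one optional field for one review, or None if it is not set."""
    row = conn.execute(
        "SELECT rf.value FROM reviews_fields AS rf"
        " JOIN fields AS f ON f.id = rf.field_id"
        " WHERE rf.review_id = ? AND f.name = ?",
        (review_id, field_name),
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
with conn:
    conn.executemany(
        "INSERT INTO review_sources VALUES (?, ?)", [(1, "Zagat"), (50, "Allergy News")]
    )
    conn.executemany(
        "INSERT INTO reviews VALUES (?, ?, ?)",
        [(0, "Joey's Burgers", 1), (254, "Joey's Burgers", 50)],  # id 254 is invented
    )
    conn.executemany(
        "INSERT INTO fields VALUES (?, ?)", [(1, "vegan-friendly"), (2, "nut-free")]
    )
    # Invented sample values, for illustration only.
    conn.executemany(
        "INSERT INTO reviews_fields VALUES (?, ?, ?)", [(0, 1, "no"), (254, 2, "yes")]
    )

zagat_fields = fields_for_source(conn, "Zagat")
vegan_value = field_value(conn, 0, "vegan-friendly")
```

The two helper functions correspond to the two queries in the question; a new optional field from a future source is just a new row in `fields`.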
Cheers