SQL Server: create rows with info from multiple tables having the same column name - sql

I am doing an integration with a customer's ERP. The database tables follow a convention whereby columns that share the same name, in whatever table they appear, must have the same data type.
With this premise, I would like to write a SQL statement or stored procedure that pulls data from several source tables, in a given order and always matching on column names, into two target tables. Since it is very likely that the ERP vendor will add new columns without notifying my department, I need the column list to be obtained dynamically.
All of this is to generate a single record in one table (in this case, the header of a purchase from a supplier) and several rows in another table (the items of the purchase).
My idea is to have an auxiliary table where I put the information coming from my system, and then execute that SQL/procedure to consolidate the information into the ERP purchase tables.
Let's take an example.
My tables would have information similar to this
(Purchase header)
ExternalOrderId | SupplierCode | PurchaseDate | PurchaseStatus | FiscalYear | Series
--------------------------------------------------------------------------------------------
ABCD | 00001 | 2021-12-11 12:00:00 | DRAFT | 2021 | S
(Purchase items)
ExternalOrderId | ArticleCode | ItemOrder | Units
--------------------------------------------------
ABCD | 1234 | 1 | 2
ABCD | 2345 | 5 | 4
ABCD | 3456 | 10 | 10
ABCD | 1234 | 15 | 3 (very important, same article can be repeated multiple times in one purchase)
.....
ABCD | 9999 | 100 | 10
A very important step is to take the fiscal year, series and number from a table of counters; the counter must be incremented as part of the process (a sketch of this follows the table below).
Example of the "Counters" table (note that there may be several numbers for one type, depending on the series and the fiscal year):
Type | FiscalYear | Series | LastNumber
----------------------------------------------------
SupplierPurchase | 2021 | S | 26
SupplierPurchase | 2021 | A | 60
SupplierPurchase | 2021 | B | 15
SaleOrder | 2021 | S | 19
SaleOrder | 2021 | X | 200
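To reserve the next number safely under concurrency, the counter can be incremented and read back in a single statement. A minimal T-SQL sketch, assuming the Counters table exactly as shown above (for the "SupplierPurchase / 2021 / S" row this returns 27 and leaves LastNumber at 27):
DECLARE @Next TABLE (Number int);

UPDATE Counters
SET LastNumber += 1
OUTPUT inserted.LastNumber INTO @Next
WHERE [Type] = 'SupplierPurchase'
  AND FiscalYear = 2021
  AND Series = 'S';

-- The freshly reserved purchase number (27 in the example).
SELECT Number FROM @Next;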
Table "Accounting data".
SupplierCode | AdditionalColumn1 | AdditionalColumn2 | AdditionalColumn3
-------------------------------------------------------------------------
00001 | AC1A | AC2A | AC3A
Table "Company data".
SupplierCode | AdditionalColumn2 | AdditionalColumn3 | AdditionalColumn4
-------------------------------------------------------------------------
00001 | AC2B | AC3B | AC4B
Table "Supplier data".
SupplierCode | AdditionalColumn3 | AdditionalColumn5
-----------------------------------------------------
00001 | AC3C | AC5C
In this case the result should be something like this: for columns with the same name, the value coming from the last table read should be kept. For example, AdditionalColumn1 will have the value from the first table (AC1A), because it is the only table with that column name, while AdditionalColumn3 will take the value from the last one (AC3C).
The final result should look something like this:
Purchase Header
FiscalYear | Series | Number | SupplierCode | AdditionalColumn1 | AdditionalColumn2 | AdditionalColumn3 | AdditionalColumn4 | AdditionalColumn5 | PurchaseStatus | PurchaseDate | ExternalPurchaseID
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
2021 | S | 27 | 00001 | AC1A | AC2B | AC3C | AC4B | AC5C | DRAFT | 2021-12-11 12:00:00 | ABCD
Note that the purchase number is 27, because in the counters table the last number used for the series "S" was 26. After creating this row, the counter must be set to 27.
In the case of the purchase items, it would be the same, obtaining the data from:
The purchase header created in the previous step.
Data from the Articles table
Data from another table with additional information about the articles.
The data from the purchase items table that I generated earlier.
But in this case, instead of being a single record, it will be a record for each item that I reflect in my auxiliary table, matching the info by the item's "ArticleCode".
I could do all this in application code, but I would like to abstract from the programming language and keep all of this logic in the database, so the process is fast, transactional and can be retried in case of failure. Besides, as I said, the columns will be dynamic, since the ERP provider can create new ones. This way I will not have to worry about escaping possible Unicode characters, and I can be sure the data types are respected at all times.
It would be nice if I could also get a boolean flag set on my auxiliary table to indicate that the purchase has been consolidated correctly.
Thanks in advance.
EDIT
As @JeroenMostert said in a response, this question is too vague. The purpose of my question is to learn how to take the column names obtained, for example from INFORMATION_SCHEMA.COLUMNS, for a table A and use them in a query, but only those that intersect with the columns of a table B, and to do this several times with several tables so that I can generate the purchase header. Then I want to use the same process (and the resulting data) to generate the purchase rows.
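For the column-intersection part, a minimal sketch (assuming SQL Server 2017+ for STRING_AGG; the table names below are placeholders, not the real ERP names):
DECLARE @sourceTable sysname = N'AuxPurchaseHeader';  -- hypothetical auxiliary table
DECLARE @targetTable sysname = N'PurchaseHeader';     -- hypothetical ERP target table
DECLARE @cols nvarchar(max), @sql nvarchar(max);

-- Columns present in BOTH tables, taken dynamically from the catalog.
SELECT @cols = STRING_AGG(QUOTENAME(s.COLUMN_NAME), N', ')
FROM INFORMATION_SCHEMA.COLUMNS AS s
JOIN INFORMATION_SCHEMA.COLUMNS AS t
  ON t.COLUMN_NAME  = s.COLUMN_NAME
 AND t.TABLE_SCHEMA = N'dbo'
 AND t.TABLE_NAME   = @targetTable
WHERE s.TABLE_SCHEMA = N'dbo'
  AND s.TABLE_NAME   = @sourceTable;

-- Build and run the INSERT using only the shared columns.
SET @sql = N'INSERT INTO dbo.' + QUOTENAME(@targetTable) + N' (' + @cols + N') '
         + N'SELECT ' + @cols + N' FROM dbo.' + QUOTENAME(@sourceTable) + N';';

EXEC sys.sp_executesql @sql;
Repeating the catalog query per source table (Accounting data, Company data, Supplier data, ...) and overwriting shared columns in reading order would give the "last table wins" behaviour described above; the same pattern, joined on ArticleCode, would cover the purchase items.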

Related

Create and display table column hierarchy in Tableau

My table currently has a number of similar numerical columns I'd like to nest under a common label.
My current table is something like:
| Week | Seller count, total | Seller count, churned | Seller count, resurrected |
| ---- | ------------------- | --------------------- | ------------------------- |
| 1 | 100 | 10 | 4 |
| 2 | 105 | 12 | 5 |
And I'd like it to be:
| | Seller count |
| Week | Total | Churned | Resurrected |
| ---- | ----- | ------- | ----------- |
| 1 | 100 | 10 | 4 |
| 2 | 105 | 12 | 5 |
I've seen examples of this, including a related instructional video, but this video hides the actual creation of the nested object (called "Segment").
I also tried creating a hierarchy by dragging items in the "Data" tab on top of one another. This function appears to only be possible for dimensions (categorical data), not measures (numerical data) like mine.
Even so, I can drag my column names from the measures side onto the dimensions side to get them to be considered dimensions. Then I can drag to nest and create the hierarchy. But then when I drag the top item of the hierarchy ("Seller count" in the example below) into the "Columns" field, I get the warning "the field being added contains 92,000 members, and maximum recommended is 1,000". It thinks this is categorical data, and is maybe planning to create a subheading for each value (100, 105, etc.), instead of the desired hierarchy sub-items as subheadings.
Any idea how to accomplish this simple hierarchical restructuring of my column labels?
Actually, this is a data-restructuring task and Tableau isn't best suited for it. Still, it is a simple one and you can do it like this:
1) I recreated a table like yours in Excel and imported it into Tableau.
2) Rename the three columns (remove "Seller count" from their names).
3) Select these three columns at once and choose Pivot to transform them.
4) Rename the resulting columns again.
5) Create a text table in Tableau, as you have shown in the question.
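If the data comes from a SQL source, the same reshape can also be done upstream before Tableau ever sees it. A hedged sketch, assuming SQL Server and that the columns have already been renamed to Total, Churned and Resurrected as in step 2 (the table name WeeklySellers is a placeholder):
SELECT [Week],
       Metric,        -- becomes the inner heading: Total / Churned / Resurrected
       SellerCount
FROM WeeklySellers
UNPIVOT (
    SellerCount FOR Metric IN ([Total], [Churned], [Resurrected])
) AS u;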

Editing a row in a database table affects all previous records that query that information. How should prior versions be stored/managed?

I’ve been working on a Windows Form App using vb.net that retrieves information from a SQL database. One of the forms, frmContract, queries several tables, such as Addresses, and displays them in various controls, such as Labels and DataGridViews. Every year, the customer’s file is either renewed or expired, and I’m just now realizing that a change committed to any record today will affect the information displayed for the customer in the past. For example, if we update a customer’s mailing address today, this new address will show up in all previous customer profiles. What is the smartest way to avoid this problem without creating separate rows in each table with the same information? Or to put it another way, how can versions of a customer’s profile be preserved?
Another example would be a table that stores customer’s vehicles.
VehicleID | Year | Make | Model | VIN | Body
---------------------------------------------------------------
1 | 2005 | Ford | F150 | 11111111111111111 | Pickup
2 | 2001 | Niss | Sentra | 22222222222222222 | Sedan
3 | 2004 | Intl | 4700 | 33333333333333333 | Car Carrier
If today vehicle 1 is changed from a standard pickup to a flatbed, then if I load the customer contract from 2016 it will also show as flatbed even though back then it was a pickup truck.
I have a table for storing individual clients.
ClientID | First | Last | DOB
---------|----------|-----------|------------
1 | John | Doe | 01/01/1980
2 | Mickey | Mouse | 11/18/1928
3 | Eric | Forman | 03/05/1960
I have another table to store yearly contracts.
ContractID | ContractNo | EffectiveDate | ExpirationDate | ClientID (foreign key)
-----------|------------|---------------|-------------------|-----------
1 | 13579 | 06/15/2013 | 06/15/2014 | 1
2 | 13579 | 06/15/2014 | 06/15/2015 | 1
3 | 24680 | 10/05/2016 | 10/05/2017 | 3
Notice that the contract number can remain the same across different periods. In addition, because the same vehicle can be related to multiple contracts, I use a bridge table to relate individual vehicles to different contracts.
Id | VehicleID | ContractID <-- both foreign keys
---|-----------|------------
1 | 1 | 1
2 | 3 | 1
3 | 1 | 2
4 | 3 | 2
5 | 2 | 3
6 | 2 | 2
When frmContract is loaded, it queries the database and displays information about that particular contract year. However, if Vehicle 1 is changed from pickup to flatbed right now, then all the previous contract years will also show it as a flatbed.
I hope this illustrates my predicament. Any guidance will be appreciated.
Some database systems have built-in temporal features that let you keep an audit history of rows. Check whether your DB has built-in support for this.
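For example, SQL Server 2016 and later support system-versioned temporal tables. A minimal sketch using the Vehicles table above (the history table name and the column sizes are assumptions):
CREATE TABLE dbo.Vehicles
(
    VehicleID  int          NOT NULL PRIMARY KEY,
    [Year]     int          NOT NULL,
    Make       varchar(20)  NOT NULL,
    Model      varchar(40)  NOT NULL,
    VIN        varchar(17)  NOT NULL,
    Body       varchar(40)  NOT NULL,
    ValidFrom  datetime2 GENERATED ALWAYS AS ROW START NOT NULL,
    ValidTo    datetime2 GENERATED ALWAYS AS ROW END   NOT NULL,
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.VehiclesHistory));

-- The vehicle as it looked on a past contract's effective date:
SELECT *
FROM dbo.Vehicles
FOR SYSTEM_TIME AS OF '2016-10-05'
WHERE VehicleID = 1;
An alternative that needs no engine support is to snapshot the vehicle attributes you care about into the contract/vehicle bridge rows at the time each contract is written, so later edits to the Vehicles table cannot rewrite history.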

Efficiently reconcile changing identifiers in SQL?

I am working with data where the user identifier changes. The user identifiers are GUIDs, so they shouldn't repeat across different users. When an identifier changes, I am given the old identifier and the current identifier on the same row in a table. I need to reconcile these values and have them both assigned to the same database-generated integer ID, which is the value I use to refer to the user elsewhere in the database.
Not long ago, the user identifiers would not change. I had the following setup:
users table
id | identifier
---------------
1 | ABC
2 | DEF
etc ...
activity table
id | timestamp | identifier | other_data
---------------------------------------------
...
29 | 1 | ABC | more data
30 | 2 | ABC | even more data
31 | 3 | ABC | etc
32 | 4 | DEF | etc
33 | 5 | DEF | etc
34 | 6 | ABC | more data
...
My goal remains to aggregate activity from the activity table into an activity_daily table. In the prior setup, that was relatively simple because I could expect that the identifier was consistent per user.
My output aggregate activity_daily table had the structure:
id | user_id | date | other_stuff
--------------------------------------
1 | 1 | 9/10/2017 | etc
2 | 1 | 9/11/2017 | etc
3 | 2 | 9/08/2017 | etc
4 | 2 | 9/09/2017 | etc
5 | 1 | 9/12/2017 | etc
...
Now, however, the activity table has changed. For the first activity record where an identifier changes, I get a value in a column called identifier_old. The activity table now looks like the following:
activity table
id | timestamp | identifier | identifier_old | other_data
-------------------------------------------------------------------
...
29 | 110 | ABC | | more data
30 | 111 | GHI | ABC | other data
31 | 112 | GHI | | etc
32 | 114 | DEF | | etc
33 | 115 | DEF | | etc
34 | 116 | JKL | DEF | etc
35 | 117 | GHI | | etc
36 | 118 | JKL | | etc
37 | 119 | JKL | | etc
38 | 120 | GHI | | etc
...
Now I need to create the same aggregate activity_daily table, with the added complexity of mapping identifier and identifier_old to the same integer id in the users table.
Each day, somewhere around 10 million records are loaded into the activity table that have to be reconciled and aggregated. There are millions of unique identifiers, so I'm trying to keep the reconciling of the identifiers and the aggregation steps as efficient as possible.
I've had two thoughts about how to approach this, but neither seems particularly efficient when considering the aggregation and joins on the activity table.
1) Create an identifiers table with columns id, identifier and user_id; the users table no longer stores the identifier. Then: a) check whether each identifier_old is in the identifiers table; if not, add it and create an entry in the users table to generate an id, storing that id on the new identifiers row. b) Look at the activity records that have a value in both identifier and identifier_old; add the identifier from those records to the identifiers table, giving it the user_id already recorded for the corresponding identifier_old. c) Do the aggregation, etc. based on the identifier column in the activity table (a rough sketch of this option appears below, after option 2).
2) Similar, but don't maintain a separate identifiers table. Instead, add a third column to the users table called user_static_id (or something). All identifier values go into the users table but those that refer to the same person share the same user_static_id and the aggregate table has a foreign key for user_static_id instead of for the id column in the users table.
Neither of these seems like a great approach, and both could significantly slow down the reconciliation and aggregation process.
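For concreteness, a rough T-SQL sketch of option 1 (everything beyond the table and column names given in the question, such as the identifiers table itself, is an assumption):
-- Mapping table: many identifiers may point at the same user.
CREATE TABLE identifiers (
    id         int IDENTITY PRIMARY KEY,
    identifier uniqueidentifier NOT NULL UNIQUE,   -- GUIDs, per the question
    user_id    int NOT NULL REFERENCES users (id)
);

-- b) Rows announcing a change: map the new identifier to the user_id
--    already recorded for the old identifier.
INSERT INTO identifiers (identifier, user_id)
SELECT DISTINCT a.identifier, m_old.user_id
FROM activity AS a
JOIN identifiers AS m_old ON m_old.identifier = a.identifier_old
WHERE a.identifier_old IS NOT NULL
  AND NOT EXISTS (SELECT 1 FROM identifiers AS m
                  WHERE m.identifier = a.identifier);

-- c) Aggregate through the mapping (the real query would also bucket by day
--    and compute the actual aggregates that go into activity_daily).
SELECT m.user_id, COUNT(*) AS activity_rows
FROM activity AS a
JOIN identifiers AS m ON m.identifier = a.identifier
GROUP BY m.user_id;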
Note: I cannot say, for certain, that the changed identifier values won't revert back to their previous values. For each user, they may continue to change periodically, they may revert, or they may remain static forever. The timestamp column in the activity table allows me to sort the records so that I don't end up encountering records with a new identifier before I encounter records that have both identifier and identifier_old.
It's also worth noting that the activity table is flushed after the aggregation has occurred.
Given this scenario, what is the most efficient way to handle this problem?

updating in access without having any connection between tables?

I have one table:
Item Table
name | Total | Remaining
chairs | 10 | 0
and another table
Event Entry Table
eventname | Chairs
birthday | 10
Party | 20
These two tables don't have any relationship (I know it's bad). My goal is to sum the total of chairs in the event table and put it in the Item field named Remaining. Sorry, I'm new to Access.
i.e.
name | Total | Remaining
chairs | 10 | 30
UPDATE Item
SET Remaining = DSum("Chairs", "EventEntry")
WHERE [name] = "chairs";
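(An UPDATE with a plain aggregate subquery, such as SET Remaining = (SELECT SUM(Chairs) FROM EventEntry), is typically rejected by Access with "Operation must use an updatable query", which is why the domain aggregate DSum is used here.)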

Retrieve comma delimited data from a field

I've created a form in PHP that collects basic information. I have a list box that allows multiple items to be selected (i.e. housing, rent, food, water). If multiple items are selected, they are stored in a field called Needs, separated by commas.
I have created a report ordered by each person's needs. People who have only one need are sorted correctly, but people who have multiple needs are sorted exactly by the string passed to the database (i.e. "housing, rent, food, water"), which is not what I want.
Is there a way to separate the multiple values in this field using SQL, counting each need instance/occurrence individually, so that no comma-delimited strings show up in the results?
Your database is not in the first normal form. A non-normalized database will be very problematic to use and to query, as you are actually experiencing.
In general, you should be using at least the following structure. It can still be normalized further, but I hope this gets you going in the right direction:
CREATE TABLE users (
user_id int,
name varchar(100)
);
CREATE TABLE users_needs (
need varchar(100),
user_id int
);
Then you should store the data as follows:
-- TABLE: users
+---------+-------+
| user_id | name |
+---------+-------+
| 1 | joe |
| 2 | peter |
| 3 | steve |
| 4 | clint |
+---------+-------+
-- TABLE: users_needs
+---------+----------+
| need | user_id |
+---------+----------+
| housing | 1 |
| water | 1 |
| food | 1 |
| housing | 2 |
| rent | 2 |
| water | 2 |
| housing | 3 |
+---------+----------+
Note how the users_needs table is defining the relationship between one user and one or many needs (or none at all, as for user number 4.)
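If the comma-separated values are already stored, a one-off migration can populate users_needs. A hedged sketch, assuming SQL Server 2016+ for STRING_SPLIT and a legacy table named form_submissions(user_id, Needs); both of those names are placeholders, so adjust for your actual RDBMS and table:
INSERT INTO users_needs (need, user_id)
SELECT LTRIM(RTRIM(s.value)) AS need, f.user_id
FROM form_submissions AS f
CROSS APPLY STRING_SPLIT(f.Needs, ',') AS s
WHERE LTRIM(RTRIM(s.value)) <> '';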
To normalise your database further, you should also use another table called needs, as follows:
-- TABLE: needs
+---------+---------+
| need_id | name |
+---------+---------+
| 1 | housing |
| 2 | water |
| 3 | food |
| 4 | rent |
+---------+---------+
Then the users_needs table should just refer to a candidate key of the needs table instead of repeating the text.
-- TABLE: users_needs (instead of the previous one)
+---------+----------+
| need_id | user_id |
+---------+----------+
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 1 | 2 |
| 4 | 2 |
| 2 | 2 |
| 1 | 3 |
+---------+----------+
You may also be interested in checking out the following Wikipedia article for further reading about repeating values inside columns:
Wikipedia: First normal form - Repeating groups within columns
UPDATE:
To fully answer your question: if you follow the above guidelines, sorting, counting and aggregating the data becomes straightforward.
To sort the result set by needs, you would be able to do the following:
SELECT users.name, users_needs.need
FROM users
INNER JOIN users_needs ON (users_needs.user_id = users.user_id)
ORDER BY users_needs.need;
You would also be able to count how many needs each user has selected, for example:
SELECT users.name, COUNT(users_needs.need) AS number_of_needs
FROM users
LEFT JOIN users_needs ON (users_needs.user_id = users.user_id)
GROUP BY users.user_id, users.name
ORDER BY number_of_needs;
I'm a little confused by the goal. Is this a UI problem or are you just having trouble determining who has multiple needs?
The number of needs is the difference:
Len([Needs]) - Len(Replace([Needs],',','')) + 1
Can you provide more information about the Sort you're trying to accomplish?
UPDATE:
I think these Oracle-based posts may have what you're looking for: post and post. The only difference is that you would probably be better off using the method I list above to find the number of comma-delimited pieces rather than the translate(...) approach the author suggests. Hope this helps; note that it's Oracle-based.