Why isn't "union all" doing what I expect? - sql

I created 2 summary tables from the same source data for different date ranges.
Now that I have these multiple summary tables, I want to put those tables together
so that I will be able to run a summary on the combined table.
It's creating the summary table that is presenting the problem.
scratch.table_1 has 809,598 records.
scratch.table_2 has 1,228,176 records.
They both have the same set of fields from the source table,
plus a "record_number" field I created on each table using count(1).
The code I used to put these two tables together was:
create table scratch.table_1_and_2
select * from scratch.table_1
union all
select * from scratch.table_2
I assumed that there would be 809,598 + 1,228,176 records in the new table (2,037,774 records).
But there are only 1,960,769 records in the new table.
What am I doing wrong?

One way to troubleshoot would be to identify some of the missing records and see what might be different about the data in those that would cause them to be left out. A UNION ALL should include duplicate records so duplicates shouldn't be the issue. Maybe there is some data issue that's causing those records to be dropped. Also I'm assuming there isn't any funny business with Views going on in the underlying tables and that no data loads are affecting your record counts.
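To make the troubleshooting step above concrete, here is a minimal sketch of one way to trace which rows go missing. It uses Python's built-in sqlite3 module only so that it runs anywhere; the SQL itself is the point. The `src` label column, the `val` column, and the sample rows are assumptions for illustration, and one row is deleted on purpose to simulate the loss. Because record_number was generated separately on each table, the two tables share the same numbers, so the rows are matched on (src, record_number) rather than on record_number alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_1 (record_number INTEGER, val TEXT);
    CREATE TABLE table_2 (record_number INTEGER, val TEXT);
    INSERT INTO table_1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO table_2 VALUES (1, 'a'), (2, 'x');

    -- Tag each row with its source so rows can be traced back later.
    CREATE TABLE table_1_and_2 AS
        SELECT 'table_1' AS src, * FROM table_1
        UNION ALL
        SELECT 'table_2' AS src, * FROM table_2;

    -- Simulate one lost row, purely for the demonstration.
    DELETE FROM table_1_and_2 WHERE src = 'table_2' AND record_number = 2;
""")

# Step 1: row counts per source in the combined table show which side lost rows.
counts = conn.execute(
    "SELECT src, COUNT(*) FROM table_1_and_2 GROUP BY src ORDER BY src"
).fetchall()

# Step 2: an anti-join lists the source rows that have no match in the combined table.
missing = conn.execute("""
    SELECT s.src, s.record_number
    FROM (
        SELECT 'table_1' AS src, record_number FROM table_1
        UNION ALL
        SELECT 'table_2' AS src, record_number FROM table_2
    ) AS s
    LEFT JOIN table_1_and_2 AS c
           ON c.src = s.src AND c.record_number = s.record_number
    WHERE c.record_number IS NULL
    ORDER BY s.src, s.record_number
""").fetchall()

print(counts)   # rows per source that made it into the combined table
print(missing)  # (src, record_number) pairs that were dropped
```

Once a few missing rows are identified this way, their field values can be compared against rows that did survive.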

Related

SQL Select statement from multiple tables to fill datagridview

I have multiple tables, the tables themselves are named after the date they were created on; so for example 4/01/2021, 5/01/2021.. etc
The tables contain all the same columns.
But I'd like to create a SQL statement that allows me to return all the tables that were created between two dates and fill a Datagridview with all the records in those tables.
Ideally I want "Created Last Week", "Created This Week", and "Created This Month" options. I can work out the syntax for the start and end dates, but I'm not sure what the correct way is to return the tables that fall between the dates.
I have looked at a few examples but none seem to work for me or be exactly what I'm after. Not sure if I can use sys.tables or if I need to use inner joins/left joins etc. to get this to work.
My tables are in an Access .MDB file.
You will need a union query:
Use a union query to combine multiple queries into a single result
However, as the tables included will vary, you must create the SQL of the query dynamically and then call the query to fill your datagridview.
Note: This is a terrible setup. You should, as soon as possible and as suggested by @June7, change your schema to have only one table, with a field holding the dates (your current table names).
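Building the query dynamically, as described above, could look roughly like this. It is only a sketch of the string-building step: `build_union_sql` is a hypothetical helper, the d/mm/yyyy reading of the table names is an assumption based on the examples in the question, and how you list the table names in the .MDB file is left out. UNION ALL is used so that identical rows from different tables are all kept; plain UNION would remove them.

```python
from datetime import date, datetime


def build_union_sql(table_names, start, end, fmt="%d/%m/%Y"):
    """Build one UNION ALL query over the tables whose names are dates in [start, end]."""
    selected = []
    for name in table_names:
        try:
            created = datetime.strptime(name, fmt).date()
        except ValueError:
            continue  # skip tables whose names are not dates
        if start <= created <= end:
            selected.append((created, name))
    selected.sort()  # oldest table first
    # Square brackets quote names that contain slashes in Access SQL.
    return " UNION ALL ".join(f"SELECT * FROM [{name}]" for _, name in selected)


# Made-up table list for illustration.
names = ["4/01/2021", "5/01/2021", "20/02/2021", "Customers"]
sql = build_union_sql(names, date(2021, 1, 1), date(2021, 1, 31))
print(sql)
```

The function returns an empty string when no table falls in the range, so check for that before running the query and filling the grid.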

Update JOIN table contents

I have a table joined from two other tables. I would like this table to stay updated with entries in the other two tables.
First Table is "employees"
I am using the ID, Last_Name, and First_Name.
And the second Table is "EmployeeTimeCardActions"
using columns ID, ActionTime, ActionDate, ShiftStart, and ActionType.
ID is the common column that the join was created on.
Because I usually get a comment saying I did not include enough information: I do not need an exact, specific code sample, and I think I have included everything needed. If there is a good reason to include more I will; I just try to keep as little company information public as possible.
Sounds like you're having your data duplicated across tables. Not a smart idea at all. You can update data in one table when a row is updated in a different one via triggers but this is a TERRIBLE approach. If you want to display data joined from 2 tables, the right approach here is using an SQL VIEW which will display the current data.
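A minimal sketch of the view approach, run through Python's built-in sqlite3 module so it can be tried without a server. The table and column names come from the question; the view name `EmployeeActions`, the column types, and the sample rows are assumptions for illustration, and the exact CREATE VIEW syntax may differ slightly in your database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (ID INTEGER PRIMARY KEY, Last_Name TEXT, First_Name TEXT);
    CREATE TABLE EmployeeTimeCardActions (
        ID INTEGER, ActionTime TEXT, ActionDate TEXT, ShiftStart TEXT, ActionType TEXT
    );

    -- The view stores only the query, not a copy of the data.
    CREATE VIEW EmployeeActions AS
        SELECT e.ID, e.Last_Name, e.First_Name,
               a.ActionTime, a.ActionDate, a.ShiftStart, a.ActionType
        FROM employees AS e
        JOIN EmployeeTimeCardActions AS a ON a.ID = e.ID;

    INSERT INTO employees VALUES (1, 'Doe', 'Jane');
""")

# No time card actions yet, so the view is empty.
before = conn.execute("SELECT COUNT(*) FROM EmployeeActions").fetchone()[0]

# Insert into the base table only; the view is never written to.
conn.execute(
    "INSERT INTO EmployeeTimeCardActions VALUES (1, '08:00', '2021-01-04', '08:00', 'IN')"
)
after = conn.execute("SELECT Last_Name, ActionType FROM EmployeeActions").fetchall()

print(before, after)
```

The new row shows up in the view immediately because the join is evaluated each time the view is queried, so there is nothing to keep in sync.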

SQL to identify duplicate columns from table having hundreds of columns

I have 250+ columns in a customer table. According to my process, there should be only one row per customer; however, I've found a few customers who have more than one entry in the table.
After running distinct on the entire table for such a customer, it still returns two rows. I suspect one of the columns may be suffixed with a space or junk characters from the source tables, resulting in two rows with the same information.
select distinct * from ( select * from customer_table where custoemr = '123' ) a;
The above query returns two rows. Inspecting the results with the naked eye, there is no difference in any column.
I can identify which column is causing the duplicates if I run a query with distinct for each column in turn, but that would be a very manual task for 250+ columns.
This sounds like a very dumb question, but I'm kind of stuck here. Please suggest a better way to identify this if you have one. Thank you.
Solving this one-time issue with SQL is too much effort. Simply copy-paste the two rows to Excel, transpose the data into columns and use some simple function like "if a==b then 1 else 0".
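If a script is handier than a spreadsheet, the same column-by-column "a == b" comparison can be sketched in a few lines. This is only an illustration: `differing_columns` is a hypothetical helper, the three-column table stands in for the real 250+ column one, and the column name `customer_id` and the sample values are made up. sqlite3 is used just to have something runnable.

```python
import sqlite3


def differing_columns(conn, sql, params=()):
    """Run a query expected to return exactly two rows; list the columns that differ."""
    cur = conn.execute(sql, params)
    names = [d[0] for d in cur.description]
    rows = cur.fetchall()
    if len(rows) != 2:
        raise ValueError(f"expected exactly 2 rows, got {len(rows)}")
    first, second = rows
    return [(n, a, b) for n, a, b in zip(names, first, second) if a != b]


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer_table (customer_id TEXT, name TEXT, city TEXT);
    -- The second row has a trailing space in name.
    INSERT INTO customer_table VALUES ('123', 'Ann', 'Paris'), ('123', 'Ann ', 'Paris');
""")

diffs = differing_columns(
    conn, "SELECT DISTINCT * FROM customer_table WHERE customer_id = ?", ("123",)
)
for name, a, b in diffs:
    # repr() makes trailing spaces and other invisible characters visible.
    print(name, repr(a), repr(b))
```

Printing with repr() is what exposes a trailing space or stray control character that looks identical in a normal result grid.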

Power Pivot relationships

Trying to create relationships (joins) between tables in power pivot.
I have 2 tables I would like to join together, connected by a common column, CustomerID.
One is a Fact Table the other Dim table (look up).
I have run the "remove duplicates" on both tables without any problem.
But I still get an error saying : "the relationship cannot be created because each column contains duplicate values. Select at least one column that contains only unique values".
The Fact Table contains duplicates (as it should?) and the Dim Table does not, so why do I get this error?
Help much appreciated
I created an appended table with both "CustomerID" columns. After the columns were appended together, I could "remove duplicates" and connect the tables through the newly created appended table.
I don't know if this causes another problem later, however.
You can also check for duplicate id values in a column by using the group by feature.
Remove all columns except ID, add a column that consists only of the number 1.
Group by ID, summing the content of the added column and filter out IDs whose total equals 1. What's left are duplicated IDs.
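For readers who can query the same data with SQL, the group-by check described above corresponds to a GROUP BY with a HAVING filter. This is a sketch under assumptions: the table name `DimCustomer` and the sample IDs are made up, and sqlite3 is used only to make it runnable. Counting rows per ID plays the role of summing the added column of 1s.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE DimCustomer (CustomerID INTEGER);
    INSERT INTO DimCustomer VALUES (1), (2), (2), (3);
""")

# Keep only the IDs whose row count is greater than 1.
duplicates = conn.execute("""
    SELECT CustomerID, COUNT(*) AS n
    FROM DimCustomer
    GROUP BY CustomerID
    HAVING COUNT(*) > 1
    ORDER BY CustomerID
""").fetchall()

print(duplicates)  # each entry is (CustomerID, number of rows with that ID)
```

An empty result means every ID in the lookup table is unique.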

Oracle SQL merge tables without specifying columns

I have a table people with less than 100,000 records and I have taken a backup of this table using the following:
create table people_backup as select * from people
I add some new records to my people table over time, but eventually I want to merge the records from my backup table into people. Unfortunately I cannot simply DROP my table as my new records will be lost!
So I want to update the records in my people table using the records from people_backup, based on their primary key id and I have found 2 ways to do this:
MERGE the tables together
use some sort of fancy correlated update
Great! However, both of these methods use SET and make me specify what columns I want to update. Unfortunately I am lazy and the structure of people may change over time and while my CTAS statement doesn't need to be updated, my update/merge script will need changes, which feels like unnecessary work for me.
Is there a way to merge entire rows without having to specify columns? I see here that not specifying columns during an INSERT will direct SQL to insert values by order. Can the same methodology be applied here, and is it safe?
NB: The structure of the table will not change between backups
Given that your table is small, you could simply
DELETE FROM table t
WHERE EXISTS( SELECT 1
FROM backup b
WHERE t.key = b.key );
INSERT INTO table
SELECT *
FROM backup;
That is slow and not particularly elegant (particularly if most of the data from the backup hasn't changed) but assuming the columns in the two tables match, it does allow you to not list out the columns. Personally, I'd much prefer writing out the column names (presumably those don't change all that often) so that I could do an update.
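Here is a small runnable sketch of the delete-then-insert approach, using Python's built-in sqlite3 module rather than Oracle, so the syntax is slightly adapted (no table alias on the DELETE). The `people` and `people_backup` names and the `id` key come from the question; the `name` column and the sample rows are assumptions. Running both statements in one transaction is my addition, so a failure between the DELETE and the INSERT does not leave rows missing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO people VALUES (1, 'Ann'), (2, 'Bob');
    CREATE TABLE people_backup AS SELECT * FROM people;

    -- Changes made after the backup: one edited row and one new row.
    UPDATE people SET name = 'Bobby' WHERE id = 2;
    INSERT INTO people VALUES (3, 'Cy');
""")

with conn:  # commits both statements together, or rolls back on error
    # Remove every row that also exists in the backup ...
    conn.execute("""
        DELETE FROM people
        WHERE EXISTS (SELECT 1 FROM people_backup b WHERE b.id = people.id)
    """)
    # ... then copy the backup rows back, relying on matching column order.
    conn.execute("INSERT INTO people SELECT * FROM people_backup")

rows = conn.execute("SELECT id, name FROM people ORDER BY id").fetchall()
print(rows)
```

Rows 1 and 2 end up with their backed-up values and row 3, added after the backup, is left untouched. The INSERT ... SELECT * only works while both tables have the same columns in the same order.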