How do I find the names of recently deleted tables in BigQuery - google-bigquery

I accidentally deleted the wrong dataset in BigQuery. I know how to recover the tables whose names I can remember, but I can't remember all the tables that were there.
How can I find the names of the deleted tables?

Related

How to assign the IDs to the referring table and how to display this correctly? (SSMS)

I am in the process of creating an audit plan using an ERD. As the image below shows, there is a permissions table with four FK columns referring to the PK columns of the other four tables. I am confused about how the IDs will relate to the other tables and how they will show up correctly in the permissions table.
For the Users table, I imported the data from master.sys.server_principals.
For the Instance table, I imported the data by using @@SERVERNAME.
For the Databases table, I imported the data from master.sys.databases.
For the Object Types table, I imported the data from master.sys.objects.
Now I am on the permissions table and stuck, because I am wondering how the IDs from the four other tables (mentioned above and shown in the image link below) will match up in this permissions table. I know I need to query master.sys.database_permissions to get the information for the columns 'Permissions_Permission_Name' and 'Permissions_Object_Name', but it's the other four ID columns that I am confused about (you can ignore the column Permissions_ID).
I'm going to use the Answer field, because there is no space in the Comment editor. This answers only part of your question: I can relate two of the four tables (Databases and Users) to system tables.
First and foremost: when filling in IDs, you would generate the records in the other tables first, keep the identity IDs that were generated, and finally create a new Permission record and fill in the correct IDs there, one in each ID field. That applies to any such change when a table contains references to other tables. I suppose you know that.
The issue is that your structure differs from the system tables. You will need more "permission" records than master.sys.database_permissions holds, because SQL Server registers these as permissions per principal (role), not permissions per user.
I solved two of the four:
The user is connected to a principal role via master.sys.database_role_members. The Id of the user role can be found in your source as master.sys.database_permissions.grantee_principal_id and the corresponding users that have this principal_id are listed in master.sys.database_role_members.
The database (ONE database) that a permission applies to is defined in your Permission record. The database name in this database record should map to a database on your server. In that database, you will find database_permissions.sys.server_principals. The users that have the permissions are (again) found in master.sys.database_role_members.
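As an illustration of the role-to-user mapping described above, here is a minimal sketch in Python, using an in-memory SQLite database as a stand-in for SQL Server. The tables `principals`, `role_members` and `permissions` are hypothetical cut-down copies that keep only the join columns (`role_principal_id`, `member_principal_id`, `grantee_principal_id`), and all ids and names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins; only the columns needed for the join are kept.
    CREATE TABLE principals (principal_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE role_members (role_principal_id INTEGER, member_principal_id INTEGER);
    CREATE TABLE permissions (grantee_principal_id INTEGER, permission_name TEXT);

    INSERT INTO principals VALUES (1, 'reporting_role'), (2, 'alice'), (3, 'bob');
    INSERT INTO role_members VALUES (1, 2), (1, 3);
    INSERT INTO permissions VALUES (1, 'SELECT');
""")

# Expand each permission granted to a role into one row per member user.
rows = conn.execute("""
    SELECT u.name, p.permission_name
    FROM permissions AS p
    JOIN role_members AS m ON m.role_principal_id = p.grantee_principal_id
    JOIN principals AS u ON u.principal_id = m.member_principal_id
    ORDER BY u.name
""").fetchall()

print(rows)
```

One permission granted to the role becomes two rows, one per member, which is why a per-user Permission table ends up with more records than the per-principal source. Permissions granted directly to a user would not appear in this join and would need a separate query.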
I'm not sure what you intend to do with the other 2 tables, Instances and Object Types.
See the Microsoft docs on the subject at https://learn.microsoft.com/en-us/sql/relational-databases/system-catalog-views/sys-database-permissions-transact-sql?view=sql-server-ver15

Should I apply type 2 history to tables with duplicate keys?

I'm working on a data warehouse project using BigQuery. We're loading daily files exported from various mainframe systems. Most tables have unique keys which we can use to create the type 2 history, but some tables, e.g. a ledger/positions table, can have duplicate rows. These files contain the full data extract from the source system every day.
We're currently able to maintain a type 2 history for most tables without knowing the primary keys, as long as all rows in a load are unique, but we have a challenge with tables where this is not the case.
One person on the project has suggested that the way to handle it is to "compare duplicates", meaning that if the DWH table has 5 identical rows and the staging table has 6 identical rows, then we just insert one more, and if it is the other way around, we just close one of the records in the DWH table (by setting the end date to now). This could be implemented by adding an extra "sub row" key to the dataset like this:
Row_number() over(partition by “all data columns” order by SystemTime) as data_row_nr
I've tried to find out if this is good practice or not, but without any luck. Something about it just seems wrong to me, and I can't see what unforeseen consequences can arise from doing it like this.
Can anybody tell me what the best way to go is when dealing with full loads of ledger data on a daily basis, for which we want to maintain some kind of history in the DWH?
No, I do not think it would be a good idea to introduce an artificial primary key based on all columns plus the index of the duplicated row.
You will solve the technical problem, but I doubt there will be any business value.
First of all, you should distinguish: the tables you get with a primary key are dimensions, for which you can recognise changes and build history.
But the tables without a PK are most probably fact tables (i.e. transaction records), which are typically not fully loaded but loaded based on some DELTA criterion.
Anyway, you will never be able to recognise an update in those records; the only possible change is an insert (deletes are typically not relevant, as the data warehouse keeps a longer history than the source system).
So my to-do list:
Check whether the duplicates are intended or illegal
Try to find a delta criterion to load the fact tables
If everything fails, make the primary key out of all columns plus a single attribute holding the number of duplicates, and build the history.
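For that last fallback, which is essentially the "compare duplicates" idea from the question, the counting logic can be sketched in Python as follows. This is only an illustration under assumptions: rows are modelled as plain tuples, the function name `diff_duplicates` and the sample ledger rows are made up, and the actual insert/close statements against the DWH are left out.

```python
from collections import Counter

def diff_duplicates(dwh_open_rows, staging_rows):
    """Compare row multiplicities between open DWH rows and today's full load.

    Returns (to_insert, to_close): Counters mapping a row tuple to the number
    of copies to insert as new versions, or to close by setting an end date.
    """
    dwh = Counter(dwh_open_rows)
    staging = Counter(staging_rows)
    to_insert = staging - dwh  # copies in the load that the DWH lacks
    to_close = dwh - staging   # open DWH copies that the load no longer has
    return to_insert, to_close

# Hypothetical data: 5 identical open rows in the DWH, 6 in today's load,
# plus one DWH row that has disappeared from the source.
dwh_rows = [("ACC1", 100)] * 5 + [("ACC2", 50)]
staging_rows = [("ACC1", 100)] * 6

to_insert, to_close = diff_duplicates(dwh_rows, staging_rows)
print(dict(to_insert))
print(dict(to_close))
```

Counter subtraction keeps only positive counts, so each result holds exactly the surplus on one side. Note that this treats a changed row as one close plus one insert, which matches the point above that updates cannot be recognised without a key.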

Eliminate duplicates automatically from table

The table will be getting new data every day from the source system, and I want the duplicates to be deleted automatically as soon as new data gets loaded into the table.
Is it possible in BigQuery?
I tried to create a view named sites_view in BigQuery with the query below
SELECT DISTINCT * FROM prd.sites
but the duplicates are not getting deleted automatically.
Below is for BigQuery:
Duplicates will not be deleted automatically; there is no such functionality in BigQuery.
You should have some process that does this as frequently as you need, or use views.
BigQuery is based on an append-only kind of design, so it accepts all the data.
This is one of the reasons there are no enforced primary/unique key constraints on it, so you can't prevent duplicates from entering the table.
So, you have to have a process like:
1.) Create a new table without duplicates from your original table.
(You can use DISTINCT/ROW_NUMBER() for doing this.)
2.) Drop original table.
3.) Rename new table with original table name.
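The three steps above can be sketched as follows. This is a minimal illustration in Python against an in-memory SQLite database, used here only as a runnable stand-in for BigQuery; the `sites` columns, the sample rows and the `_dedup` suffix are assumptions, and the exact statements would differ in BigQuery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table with one exact duplicate row.
    CREATE TABLE sites (site_id INTEGER, name TEXT);
    INSERT INTO sites VALUES (1, 'north'), (1, 'north'), (2, 'south');
""")

def deduplicate(conn, table):
    """Rebuild `table` without exact duplicate rows (steps 1 to 3)."""
    tmp = f"{table}_dedup"  # hypothetical name for the intermediate table
    # The table name is interpolated directly, so it must come from trusted code.
    conn.execute(f"CREATE TABLE {tmp} AS SELECT DISTINCT * FROM {table}")  # step 1
    conn.execute(f"DROP TABLE {table}")                                    # step 2
    conn.execute(f"ALTER TABLE {tmp} RENAME TO {table}")                   # step 3

deduplicate(conn, "sites")
remaining = conn.execute(
    "SELECT site_id, name FROM sites ORDER BY site_id"
).fetchall()
print(remaining)
```

Such a function would have to be run by a scheduled process after each load, since nothing triggers it automatically.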
Let me know if this information helps.

Adding record with new foreign key

I have a few tables to store company information in my database, but I want to focus on two of them. One table, Company, contains CompanyID, which is autoincremented, and some other columns that are irrelevant for now. The problem is that companies use different versions of their names (e.g. IBM vs. International Business Machines) and I need to store them all for further use, so I can't keep names in the Company table. Therefore I have another table, CompanyName, that uses CompanyID as a foreign key (it's a one-to-many relation).
Now I need to import some new companies, and I have names only. I want to add them to the CompanyName table, but create new records in the Company table immediately, so I can put the right CompanyID in the CompanyName table.
Is it possible with one query? How should I approach this problem properly? Do I need to go as far as writing a VBA procedure to add records one by one?
I searched Stack and other websites, but I didn't find any solution for my problem, and I can't figure it out myself. I guess it could be done with a form and subform, but ultimately I want to put all my queries in a macro, so the data import would be done automatically.
I'm not a database expert, so maybe I just designed it badly, but I couldn't figure out another way to cleanly store multiple names of the same entity.
The table structure you set up appears to be a good way to do this, but there is no way to insert records into both tables at the same time. One option is to write two queries that insert records into Company and then into CompanyName. After inserting records into Company, you will need a query that joins the source table to the Company table on a field that uniquely identifies the record, other than the autoincrement key. That will allow you to get the key field from Company for use when you insert into CompanyName.
The other option is to write some VBA code that loops through the source data, inserting records into both tables. That would be preferable, since it should be more reliable.
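The loop approach can be sketched as follows. Note the swap: this is Python with an in-memory SQLite database rather than VBA, used only to show the order of operations (insert the parent row, read back its generated key, insert the child row). The column `Name` and the function `import_companies` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Company (CompanyID INTEGER PRIMARY KEY AUTOINCREMENT);
    CREATE TABLE CompanyName (
        CompanyID INTEGER REFERENCES Company(CompanyID),
        Name TEXT  -- hypothetical column name
    );
""")

def import_companies(conn, names):
    """Create one Company row per name and link the name to its new key."""
    for name in names:
        # Insert the parent first so the database generates CompanyID.
        cur = conn.execute("INSERT INTO Company DEFAULT VALUES")
        # lastrowid is the key generated by the insert just executed.
        conn.execute(
            "INSERT INTO CompanyName (CompanyID, Name) VALUES (?, ?)",
            (cur.lastrowid, name),
        )
    conn.commit()

import_companies(conn, ["IBM", "Acme"])
linked = conn.execute(
    "SELECT CompanyID, Name FROM CompanyName ORDER BY CompanyID"
).fetchall()
print(linked)
```

Because the key is read back immediately after each parent insert, no join on a second unique field is needed, which is what makes the loop the more reliable of the two options.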

To find out the User name or ID who updated the address of the other staff

This is my first post on this forum and hope I will get an answer.
I have very limited info with me about my database.
The query is like:
I want to know who updated the address of another staff member. It was surely updated from the Java-based application, but I found out that my database has an audit schema, and in it I should be able to find the name of the user who updated the address.
But I don't know which table holds this information, as we have 1000+ tables in my database.
Could you please help me find the exact table where this info is available?
Aija, this is a difficult question to answer as there are so many possibilities; however, we may be able to help you narrow it down. Tables like this often start with audit, history, change, etc., or the reverse, with that word appended to the name of the table they are tracking, e.g. audit_personnel or personnel_change. You say you have 1,000+ tables. That is a lot, but I have worked with bigger. It is still feasible to go through the list of table names one by one. When databases get this big, naming standards come into play. Have a look at the way the table names are put together, and you will be able to narrow down your search a lot.
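As a rough sketch of that name-based narrowing, assuming you have already exported the list of table names from the database catalog, a small Python filter could look like this. The pattern, the function name and the sample table names are all made up for illustration.

```python
import re

# Match audit/history/change as a whole word at the start or end of a name,
# or between underscores, so "exchange_rates" is not picked up by "change".
AUDIT_PATTERN = re.compile(r"(^|_)(audit|history|change)(_|$)", re.IGNORECASE)

def audit_candidates(table_names):
    """Return the table names that look like audit or history tables."""
    return [name for name in table_names if AUDIT_PATTERN.search(name)]

tables = [
    "personnel",
    "audit_personnel",
    "personnel_change",
    "exchange_rates",
    "ORDER_HISTORY",
]
candidates = audit_candidates(tables)
print(candidates)
```

The keyword list should be adjusted once you have seen which naming standard your database actually follows.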
Thanks for your input.
I have gone through all the tables whose names start or end with audit. I found one called audit trail, which contains multiple tables, but I was not able to find the info I expected.
I am not even sure whether these tables fall under my privileges or belong to sys or some other user.
Another option then is the single table audit control. In this style, the table has 4 major components. First the data being changed which will be something like the table and field, maybe recid. Second is the original data. Third is the new data. Fourth is the who and when of the change. If this is the style, then you will need to know which table it is that you want to track. Then you will need something like "select * from [audit table] where [audit table].[monitored table] = [target table]".
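To make the single-table style concrete, here is a minimal sketch in Python with an in-memory SQLite database. Everything in it is assumed for illustration: the table `audit_log`, its column names, and the sample rows are hypothetical and will not match your schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical single audit table with the four components:
    -- what was changed, the old value, the new value, and who/when.
    CREATE TABLE audit_log (
        monitored_table TEXT, monitored_field TEXT, record_id INTEGER,
        old_value TEXT, new_value TEXT,
        changed_by TEXT, changed_at TEXT
    );
    INSERT INTO audit_log VALUES
        ('staff', 'address', 42, '1 Old Rd', '2 New St', 'jsmith', '2024-01-05'),
        ('staff', 'phone', 42, '555-0100', '555-0199', 'akumar', '2024-01-06'),
        ('orders', 'status', 7, 'open', 'closed', 'jsmith', '2024-01-07');
""")

def who_changed(conn, table, field):
    """List who changed a given field of a given monitored table, and when."""
    return conn.execute(
        "SELECT record_id, changed_by, changed_at FROM audit_log "
        "WHERE monitored_table = ? AND monitored_field = ? "
        "ORDER BY changed_at",
        (table, field),
    ).fetchall()

changes = who_changed(conn, "staff", "address")
print(changes)
```

The filter on the monitored table is the same idea as the "select * from [audit table] where ..." query above, with an extra filter on the field so that only address changes are returned.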