Gather single rows from multiple tables in Microsoft Access - sql

I have several tables in Microsoft Access 2013, all of which follow the same format of:
ID | Object | Person 1 | Person 2 | Person 3 |
ID | String | Yes/No | Yes/No | Yes/No |
What I would like to do is make a query where I supply a string value for each table and it returns the entire matching row, with each string getting its own row, so the output looks like:
ID Number | Object | Person 1...
Table 1 ID | Table 1 String | Table 1 Yes/No...
Table 2 ID | Table 2 String | Table 2 Yes/No...
Every time I try, though, it puts all the data into one extremely long row that's impossible to look at. All of my searching has only turned up people trying to do the exact opposite of what I'm doing, so I must be missing something obvious. Any tips?
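One way to get one row per table is a UNION query rather than a join; a minimal sketch, assuming two tables named Table1 and Table2 (placeholder names) and bracketed parameters for the search strings:

SELECT ID, Object, [Person 1], [Person 2], [Person 3]
FROM Table1
WHERE Object = [String for Table 1]
UNION ALL
SELECT ID, Object, [Person 1], [Person 2], [Person 3]
FROM Table2
WHERE Object = [String for Table 2];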

Related

Rebuild tables from joined table

I am facing an issue where a data supplier is generating a dump of his multi-tenant databases into a single table. Recreating the original tables is not impossible; the problem is that I am receiving millions of rows every day, and recreating everything, every day, is out of the question.
Until now I have been using SSIS to do this, with a lookup-intensive approach. In the past year, my virtual machine went from having 2 GB of RAM to 128 GB, and it is still growing.
Let me explain the mess:
Imagine a database where users have posts, and posts have comments. In my real scenario, I am talking about 7 distinct tables. Analyzing a few rows, I have the following:
+-----+------+------+--------+------+-----------+------+----------------+
| Id* | T_Id | U_Id | U_Name | P_Id | P_Content | C_Id | C_Content      |
+-----+------+------+--------+------+-----------+------+----------------+
|   1 |    1 |    1 | john   |    1 | hello     |    1 | hello answer 1 |
|   2 |    1 |    2 | maria  |    2 | cake      |    2 | cake answer 1  |
|   3 |    2 |    1 | pablo  |    1 | hello     |    1 | hello answer 3 |
|   4 |    2 |    1 | pablo  |    2 | hello     |    2 | hello answer 2 |
|   5 |    1 |    1 | john   |    3 | nosql     |    3 | nosql answer 1 |
+-----+------+------+--------+------+-----------+------+----------------+
The Id is from my table.
T_Id is the "tenant" Id, which identifies which of the multiple source databases the row came from.
I have imagined the following possible solution:
I make a query that selects non-existent Ids for each table, such as:
SELECT DISTINCT n.t_id,
                n.c_id,
                n.c_content
FROM mytable n
WHERE n.id > 4
  AND NOT EXISTS (SELECT 1
                  FROM mytable o
                  WHERE o.id <= 4
                    AND n.t_id = o.t_id
                    AND n.c_id = o.c_id)
This way, I am able to select only the new occurrences whenever a new Id of a table is found. Although it works, it may perform badly when working with 100s of millions of rows.
Could anyone share a suggestion? I am quite lost.
Thanks in advance.
EDIT > my question is vague
My final intent is to rebuild the tables from the dump, incrementally, avoiding lookups outside the database. Every now and then I am going to run a script that will select new tenants, users, posts and comments and add them to their corresponding tables (roughly along the lines of the sketch after this edit).
My previous solution worked as follows:
Cache the whole database
For each new row, search for the columns inside the cache
If it doesn't exist, then insert it
I know it sounds dumb, but it made sense as a new developer working with ETLs
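A rough sketch of what one step of that incremental script could look like for the comments table, reusing the NOT EXISTS pattern above (the target table comments and its columns are hypothetical names, not the actual schema):

INSERT INTO comments (tenant_id, comment_id, content)
SELECT DISTINCT n.t_id, n.c_id, n.c_content
FROM mytable n
WHERE NOT EXISTS (SELECT 1
                  FROM comments c
                  WHERE c.tenant_id = n.t_id
                    AND c.comment_id = n.c_id)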
First, if you have a full flat DB dump, I'd suggest you work on your file before even importing it into your DB (low-level file processing is pretty cheap and nearly instantaneous).
Following Removing lines in one file that are present in another file using python, you can remove all the already-parsed lines since your last run.
# Load the new dump and the previous dump
with open('new.csv', 'r') as source:
    lines_src = source.readlines()
with open('old.csv', 'r') as f:
    lines_f = f.readlines()

# Keep only the lines that were not present in the previous dump
with open('diff_add.csv', 'w') as destination:
    for data in lines_src:
        if data not in lines_f:
            destination.write(data)
This takes less than five seconds on a 900 MB => 1.2 GB dump. With this you'll only work with lines that actually introduce a change in one of your new tables.
Now you can import this reduced flat file into a working table.
As you'll have to search for the needle in each line, some indexes on the ids may be a good idea (go for a composite index that uses your tenant id first).
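A minimal sketch of such an index, assuming the working table and column names from the question:

CREATE INDEX IX_mytable_tenant_comment ON mytable (t_id, c_id);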
For the last part, I don't know exactly what your data looks like; do you also have updates to apply?
The EXCEPT and INTERSECT operators can also help with this kind of problem.
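For example, a sketch using EXCEPT to keep only rows from a new import that are not already in the working table (staging_import is a hypothetical staging table name):

SELECT t_id, c_id, c_content
FROM staging_import
EXCEPT
SELECT t_id, c_id, c_content
FROM mytable;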

How can I link my Junction table to my main table

I have a SQL database with the main table called Results. This table stores a record of results of tests that are run nightly.
The Results table has many fields, but for argument's sake let's just say for now it looks like this:
ResultID (unique key field generated upon insert)
Result (nvarchar(10))
What I wanted to be able to record was a list of tags used in the tests that were run. The tags may be different for each result, and an array of them is stored.
I created a junction table as shown below called Tags:
TagID (int, unique key field generated on insert)
ResultID (int)
ScenarioTag (nvarchar(128))
FeatureTag (nvarchar(128))
So what I'm looking to do is link these 2 together. I'm not so great with databases, I'll be honest.
I was thinking that when I save the test results with my normal SQL query, immediately afterwards I would loop through each tag and save the tags to this new table, but maybe I'm wrong here?
Pseudocode:
// Returned from the previous SQL statement that inserted the results values into the DB
int ResultID = SQLQueryReturnValue;
Foreach TAG in TAGS
{
    // Note the VALUES keyword and the @ parameter markers
    string SQLQuery = "INSERT INTO Tags (ResultID, ScenarioTag, FeatureTag) VALUES (@ResultID, @ScenarioTag, @FeatureTag)";
    CmdSql.Parameters.Clear();
    CmdSql.Parameters.AddWithValue("@ResultID", ResultID);
    CmdSql.Parameters.AddWithValue("@ScenarioTag", TAG.Scenario);
    CmdSql.Parameters.AddWithValue("@FeatureTag", TAG.Feature);
    CmdSql.CommandText = SQLQuery;
    CmdSql.ExecuteNonQuery();
}
Here's an example of what each table might actually look like:
Results Table
| ResultID | Result |
| 10032    | Pass   |
| 10031    | Fail   |
| 10030    | Fail   |
Tags Table
| TagID | ResultID | ScenarioTag   | FeatureTag |
| 6     | 10032    | Cheque        | Trading    |
| 5     | 10032    | GBP           | Sales      |
| 4     | 10031    | Direct Credit | Trading    |
| 3     | 10031    | GBP           | Purchase   |
| 2     | 10030    | Wire          | Dividends  |
| 1     | 10030    | USD           | Payments   |
So finally onto my question... is there a way that I can physically link this new "Tags" table to my Results table? It's informally linked in a way via the ResultID, but there's no physical link.
Is this what you're looking for? (Assumption: this query starts from Results; results do not necessarily have to have Tags...)
SELECT *
FROM Results
LEFT JOIN Tags ON Results.ResultID=Tags.ResultID
EDIT: Maybe I did not understand what you mean by "physically". You could add a foreign key constraint:
ALTER TABLE Tags ADD CONSTRAINT FK_Tags_Results FOREIGN KEY (ResultID) REFERENCES Results(ResultID);
This constraint adds a relation between these tables, making sure that only values existing in Results are allowed in Tags as ResultID. On the other hand, you cannot delete a Results row that still has child rows in Tags...
If you do this you could alter the top query to:
SELECT *
FROM Tags
INNER JOIN Results ON Results.ResultID=Tags.ResultID
Now you are looking from Tags (the leading table), and you know that each tag must have a ResultID (INNER JOIN).
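As a side note on the delete restriction mentioned above: if deleting a Result should remove its Tags automatically, the same constraint can instead be declared with ON DELETE CASCADE; a sketch using the table names above:

ALTER TABLE Tags ADD CONSTRAINT FK_Tags_Results
    FOREIGN KEY (ResultID) REFERENCES Results(ResultID)
    ON DELETE CASCADE;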

How to select with bitwise flag values in SQL

I have two tables in a SQL Server DB. One table BusinessOperations has various information about this business object, the other table OperationType is purely a bitwise flag table that looks like this:
| ID | Type    | BitFlag |
| 1  | Basic-A | -2      |
| 2  | Basic   | -1      |
| 3  | Type A  | 0001    |
| 4  | Type B  | 0002    |
| 5  | Type C  | 0004    |
| 6  | Type D  | 0008    |
| 7  | Type E  | 0016    |
| 8  | Type F  | 0032    |
The BitFlag column is a varchar column; the bit flags were inserted as strings such as '0001'. In the BusinessOperations table, there's a column that the application using these tables updates based on what is selected in the application's UI. As an example, I have one operation which has the Basic, Type A, and Type B types selected; the column value in BusinessOperations is 3.
Based on this, I am trying to write a query which will show me something like this:
| ID | Name | Description | OperationType |
| 1 | Test | Test | Basic, Type A, Type B |
Here is the actual layout of the BusinessOperations table (Basic-A and Basic are bit columns):
| ID | Name | Description | Basic-A | Basic | OperationType |
| 1 | Test | Test | 0 | 1 | 3 |
There is nothing that relates these two tables to each other, so I cannot perform a join. I am very inexperienced with bitwise operations and am at a loss on how exactly to structure my select query which is being used to analyze this data. I feel like it needs a STUFF or CASE, but I don't know how I can get this to just show the types and not just the resultant BitFlag.
SELECT ID, Name, Description, OperationType
FROM OperationType
ORDER BY ID
Since you're storing the flag in OperationType as a VARCHAR, the first thing you need to do is CONVERT or CAST the string to a number so we can do proper bitwise comparisons. I'm slightly unfamiliar with SQL Server, but you may need to remove the leading zeroes before the cast. Thus, the OperationType column in our desired SQL will look something like
CONVERT(INT, BitFlag)
Then, comparing that to our OperationType column would look something like
CONVERT(INT, BitFlag) & OperationType
The full query would look something like (forgive my lack of SQL Server expertise again):
SELECT bo.ID, bo.Name, bo.Description, ot.Type
FROM BusinessOperations AS bo
JOIN OperationType AS ot
  ON CONVERT(INT, ot.BitFlag) & bo.OperationType <> 0
The above query will effectively get you a list of the OperationTypes. If you absolutely need them on one line, see other answers to learn how to emulate something like GROUP_CONCAT in SQL Server. Disclaimer: Joining on a bitmask gives no guarantee of performance.
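If you do need the types concatenated onto one row, a sketch of the common STUFF / FOR XML PATH trick (same tables and columns as above):

SELECT bo.ID, bo.Name, bo.Description,
       STUFF((SELECT ', ' + ot.Type
              FROM OperationType AS ot
              WHERE CONVERT(INT, ot.BitFlag) & bo.OperationType <> 0
              FOR XML PATH('')), 1, 2, '') AS OperationTypes
FROM BusinessOperations AS bo;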
The other problem this answer does not solve is that of your legacy Basic and Basic-A fields. Personally, I'd do one of two things:
Remove them from the OperationType table and have the application tack the two on, based on the Basic and Basic-A columns as appropriate.
Put Basic and Basic-A as their own, positive flags in the OperationType table, and have the application populate the legacy columns as well as the OperationType column as appropriate.
As Aaron Bertrand has said in the comments, this really isn't an issue for Bitmasking at all. Having a many-many table that associates BusinessOperations.ID to OperationType.ID would solve all your problems in a much better way.
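A sketch of that many-to-many association table (hypothetical name, assuming the ID columns shown above):

CREATE TABLE BusinessOperationType (
    BusinessOperationID INT NOT NULL REFERENCES BusinessOperations(ID),
    OperationTypeID     INT NOT NULL REFERENCES OperationType(ID),
    PRIMARY KEY (BusinessOperationID, OperationTypeID)
);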
In the BusinessOperations table the Basic-A and Basic fields are bit fields, which is just another way of saying the value can only be 1 or 0. Think of it like a boolean True/False value. So, in your query you can check each of those to determine whether to include 'Basic-A' and 'Basic' or not.
The OperationType is probably an id which you can look up in the OperationType table to get the Type and BitFlag. Without understanding your data completely, it looks as if you could do a join for that part. Hopefully that is in the right general direction. If not, let me know.

MS Access 2007 - checklist options (multiple) to be stored in a column of the database

I have a situation like this: the customer form in MS Access 2007 has a list of documents provided by customers, presented as a checklist. Assume there are 6 documents in the checklist. If one or more items are checked, all the selected items should be saved in the database column named "Documents_Provided". What do I have to do to achieve this scenario? How should my database field "Documents_Provided" be declared, and what do I have to write in VBA code?
As your question heading suggests, "multiple options to be stored in a column of the database" is a very bad table design; it breaks one of the fundamental rules of database design: data should be atomic.
What you should have is a one-to-many relationship between the Customer and Document tables. The Customer table will normally have the basic customer information (the one side of the relationship), and the Documents table will have all the documents that pertain to each customer (the many side). In addition you will have another table, Document Category, that lists all the documents a customer needs/can have. So sample data in your tables will be something like:
tbl_Customers
`````````````
 ID | customerName | customerArea
----+--------------+---------------
  1 | Paul         | Bournemouth
  2 | Eugin        | Bristol
  3 | Francis      | London
tbl_DocumentsCategory
`````````````````````
 ID | DocumentName
----+----------------------
  1 | Address Proof
  2 | Photo ID
  3 | Employer Certificate
tbl_CustomersDocument
`````````````````````
 ID | CustomerID | DocumentID
----+------------+------------
  1 | 1          | 1
  2 | 1          | 2
  3 | 1          | 3
  4 | 2          | 1
  5 | 2          | 3
  6 | 3          | 2
So when you need to get the list of documents each customer has, you simply JOIN the tables to get the right information. This is the standard and efficient way to organize the data. I hope this helps, and that you stick to this.
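For illustration, a sketch of that JOIN in Access SQL, using the sample table and column names above (Access requires the parentheses around nested joins):

SELECT c.customerName, d.DocumentName
FROM (tbl_Customers AS c
      INNER JOIN tbl_CustomersDocument AS cd ON cd.CustomerID = c.ID)
      INNER JOIN tbl_DocumentsCategory AS d ON d.ID = cd.DocumentID;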

SQL: Creating a common table from multiple similar tables

I have multiple databases on a server, each with a large table where most rows are identical across all databases. I'd like to move this table to a shared database and then have an override table in each application database which has the differences between the shared table and the original table.
The aim is to make updating and distributing the data easier as well as keeping database sizes down.
Problem constraints
The table is a hierarchical data store with date based validity.
table DATA (
    ID int primary key,
    CODE nvarchar,
    PARENT_ID int foreign key references DATA(ID),
    END_DATE datetime,
    ...
)
Each unique CODE in DATA may have a number of rows, but at most a single row where END_DATE is null or greater than the current time (a single valid row per CODE). New references are only made to valid rows.
Updating the shared database should not require anything to be run in application databases. This means any override tables are final once they have been generated.
Existing references to DATA.ID must point to the same CODE, but other columns do not need to be the same. This means any current rows can be invalidated if necessary and multiple occurrences of the same CODE may be combined.
PARENT_ID references must have same parent CODE before and after the split. The actual PARENT_ID value may change if necessary.
The shared table is updated regularly from an external source and these updates need to be reflected in each database's DATA. CODEs that do not appear in the external source can be thought of as invalid; new references to these will not be added.
Existing functionality will continue to use DATA, so the new view (or alternative) must be transparent. It may, however, contain more rows than the original, provided the earlier constraints are met.
New functionality will use the shared table directly.
Select performance is a concern, insert/update/delete is not.
The solution needs to support SQL Server 2008 R2.
Possible solution
-- in a single shared DB
DATA_SHARED (table)
-- in each app DB
DATA_SHARED (synonym to DATA_SHARED in shared DB)
DATA_OVERRIDE (table)
DATA (view of DATA_SHARED and DATA_OVERRIDE)
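A sketch of the per-application-database objects this implies (SharedDb and the nvarchar length are hypothetical; the column list is abbreviated):

-- in each app DB
CREATE SYNONYM DATA_SHARED FOR SharedDb.dbo.DATA_SHARED;

CREATE TABLE DATA_OVERRIDE (
    ID int primary key,
    CODE nvarchar(50),
    PARENT_ID int
    -- END_DATE omitted: override rows are implicitly invalid (see below)
);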
Take an existing DATA table to become DATA_SHARED.
Exclude IDs with more than one possible CODE so only rows common across all databases remain. These missing rows will be added back once the data is updated the first time.
Unfortunately every DATA_OVERRIDE will need all rows that differ in any table, not only rows that differ between DATA_SHARED and the previous DATA. There are several IDs that differ only in a single database, which causes all the other databases to inflate. Ideas?
This solution causes DATA_SHARED to have a discontinuous ID space. It's a mild annoyance rather than a major issue, but worth noting.
edit: I should be able to keep all of the rows in DATA_SHARED, just invalidate them; then I only need to store differing rows in DATA_OVERRIDE.
I can't think of any situations where PARENT_ID references become invalid, thoughts?
Before:
DB1.DATA
ID | CODE | PARENT_ID | END_DATE
 1 | A    | NULL      | NULL
 2 | A1   | 1         | 2020
 3 | A2   | 1         | 2010
DB2.DATA
ID | CODE | PARENT_ID | END_DATE
 1 | A    | NULL      | NULL
 2 | X    | NULL      | NULL
 3 | A2   | 1         | 2010
 4 | X1   | 2         | NULL
 5 | A1   | 1         | 2020
After initial processing (DATA_SHARED created from DB1.DATA):
SHARED.DATA_SHARED
ID | CODE | PARENT_ID | END_DATE
 1 | A    | NULL      | NULL
 3 | A2   | 1         | 2010
-- END_DATE is omitted from DATA_OVERRIDE as every row is implicitly invalid
DB1.DATA_OVERRIDE
ID | CODE | PARENT_ID
 2 | A1   | 1
DB2.DATA_OVERRIDE
ID | CODE | PARENT_ID
 2 | X    |
 4 | X1   | 2
 5 | A1   | 1
After update from external data where A1 exists in source but X and X1 don't:
SHARED.DATA_SHARED
ID | CODE | PARENT_ID | END_DATE
 1 | A    | NULL      | NULL
 3 | A2   | 1         | 2010
 6 | A1   | 1         | 2020
edit: The DATA view would be something like:
select D.ID, ...
from DATA_SHARED D
left join DATA_OVERRIDE O on D.ID = O.ID
where O.ID is null
union all
select ID, ...
from DATA_OVERRIDE
order by ID
Given the small number of rows in DATA_OVERRIDE, performance is good enough.
Alternatives
I also considered an approach where instead of DATA_SHARED sharing IDs with the original DATA, there would be mapping tables to link DATA.IDs to DATA_SHARED.IDs. This would mean DATA_SHARED would have a much cleaner ID-space and there could be less data duplication, but the DATA view would require some fairly heavy joins. The additional complexity is also a significant negative.
Conclusion
Thank you for your time if you made it all the way to the end; this question ended up quite long as I was thinking it through while writing it. Any suggestions or comments would be appreciated.