Correct framework for different attribute values in Dimension - sql

I am attempting to build a Customer dimension from about 5 of our underlying systems. Each system has a different level of data quality, and as a result I end up with different values for the same attributes across rows.
I need to merge the rows into a single row, as it is the same customer and I need a golden record.
I am running MS SQL Server 2019.
So my data looks something like this (from source system):
ClientID, PhoneNumber, Address, SourceSystemID
123, 63310042, '123 Test Street, Test Town, 2800,', SAP
123,+61310042,'123 test st, Test town 2800', Netsuite
I am wondering what the best approach is here, because I want to:
ensure that transactions from both source systems, SAP and Netsuite, are shown whenever I view this customer.
analyse geographical information, e.g. that they are from 'Test Town'.
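One common way to approach this is a survivorship rule: rank the source systems by how much they are trusted, take the golden row from the best-ranked source, and keep every source key so facts from all systems still resolve to the same customer. The sketch below uses Python's built-in sqlite3 only so that it runs anywhere; the table names StgCustomer and SourcePriority, the TrustRank column and the ranking itself are assumptions for illustration, and the SQL would need adapting for SQL Server 2019.

```python
import sqlite3

# Staging rows modelled on the two sample rows in the question.
rows = [
    (123, "63310042", "123 Test Street, Test Town, 2800,", "SAP"),
    (123, "+61310042", "123 test st, Test town 2800", "Netsuite"),
]
# Assumed trust ranking (lower number wins); not taken from the question.
priority = [("SAP", 1), ("Netsuite", 2)]

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE StgCustomer (ClientID INT, PhoneNumber TEXT, Address TEXT, SourceSystemID TEXT);
    CREATE TABLE SourcePriority (SourceSystemID TEXT PRIMARY KEY, TrustRank INT);
""")
con.executemany("INSERT INTO StgCustomer VALUES (?, ?, ?, ?)", rows)
con.executemany("INSERT INTO SourcePriority VALUES (?, ?)", priority)

# Golden record: for each ClientID keep the row from the most trusted source.
golden = con.execute("""
    SELECT s.ClientID, s.PhoneNumber, s.Address, s.SourceSystemID
    FROM StgCustomer AS s
    JOIN SourcePriority AS p ON p.SourceSystemID = s.SourceSystemID
    WHERE p.TrustRank = (
        SELECT MIN(p2.TrustRank)
        FROM StgCustomer AS s2
        JOIN SourcePriority AS p2 ON p2.SourceSystemID = s2.SourceSystemID
        WHERE s2.ClientID = s.ClientID
    )
""").fetchall()

# Source-key list: every (ClientID, SourceSystemID) pair is kept, so
# transactions from SAP and Netsuite can both be joined to the golden row.
source_keys = con.execute(
    "SELECT ClientID, SourceSystemID FROM StgCustomer ORDER BY SourceSystemID"
).fetchall()

print(golden)
print(source_keys)
```

This picks whole rows from one source. A per-attribute rule (for example, phone from one system and address from another) follows the same pattern with one ranking per column. Geographical analysis would additionally need the address parsed into separate columns such as town and postcode, which this sketch does not attempt.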

Related

Databases design and primary key composed

I have a table named minibar_bill that I use to keep a record of each client's expenditure. I'm trying to build a hotel/pension management system.
I thought that I could make a table
Minibar_bill with (id_bill, id_minibar_product, id_client)
And I would like to add that info to an invoice based on id_bill...
How should I do it?
I mean I want to have something like this:
Id_bill(1)
id_minibar_product(1,2,3)
id_client(123)
So first 3 records will be :
1, 1, 123
1, 2, 123
1, 3, 123
And I want the id_bill to be on the invoice ... maybe I could switch id_product with id_bill
Where id_bill(1) - would be the first bill record in the database
id_minibar_product(1,2,3) - would be products 1, 2 and 3, which have been consumed by the client
id_client(123) - the client ID which we use on the invoice to collect data from the Client table in order to print it on the invoice (I will use C# for the UI).
What I have tried:
I've tried to make a db with the fields id_bill and id_product, but I think it's the wrong approach, since I made them a composite primary key and I cannot reference them as a foreign key in the Invoice table.
Here are some suggestions for your design:
It's a good idea to name things descriptively, but if you create a table called Minibar_bill, that's going to be inconsistent and short sighted if you want to start charging in-room movies and in-room dining, services etc. to the room. I suggest you call it something more generic - remove Minibar from all of your table names.
You must never put comma separated values into a single field.
There are a million sales data models online, including, as already suggested, templates in MS Access. There's no point reinventing the wheel.
I suggest you have something like this:
Client: a list of clients
Products: a list of products you can be billed for (not just minibar)
Bill: a client has zero or more bills (usually one)
BillLine: a bill has zero or more lines. Each line represents one product being charged for on a bill
So Bill is the header. It's up to you whether you add a column indicating when / if it is invoiced, paid etc., or whether you want to create a separate invoicing module.
With regards to this comment:
What I wish for is to link Invoice to minibar_bill in order to have the status on a single Invoice of all products from minibar which have been bought by a customer.
If you have a separate invoice table you can write the BillID to it to link it.
I'm not sure if you understand that all this info exists across different tables, and when, for example, you print an invoice, you go and collect all the info from across the tables at that time.
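The header/line layout above, and the idea of collecting the information at print time, can be sketched roughly as follows. Python's built-in sqlite3 is used only to make the example runnable; the column names, the sample products and prices, and the Invoice table are assumptions for illustration, not the asker's actual schema.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE Client   (ClientID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, Name TEXT NOT NULL, Price REAL NOT NULL);
    -- Bill is the header; BillLine holds one row per product charged.
    CREATE TABLE Bill     (BillID INTEGER PRIMARY KEY,
                           ClientID INTEGER NOT NULL REFERENCES Client(ClientID));
    CREATE TABLE BillLine (BillLineID INTEGER PRIMARY KEY,
                           BillID INTEGER NOT NULL REFERENCES Bill(BillID),
                           ProductID INTEGER NOT NULL REFERENCES Products(ProductID));
    -- The invoice only stores the BillID; everything else is looked up.
    CREATE TABLE Invoice  (InvoiceID INTEGER PRIMARY KEY,
                           BillID INTEGER NOT NULL REFERENCES Bill(BillID));

    INSERT INTO Client VALUES (123, 'Example Guest');
    INSERT INTO Products VALUES (1, 'Water', 2.0), (2, 'Juice', 3.0), (3, 'Peanuts', 4.0);
    INSERT INTO Bill VALUES (1, 123);
    INSERT INTO BillLine (BillID, ProductID) VALUES (1, 1), (1, 2), (1, 3);
    INSERT INTO Invoice VALUES (1, 1);
""")

# At print time, gather client and product details through the BillID link.
invoice_lines = con.execute("""
    SELECT c.Name, p.Name, p.Price
    FROM Invoice AS i
    JOIN Bill     AS b  ON b.BillID = i.BillID
    JOIN Client   AS c  ON c.ClientID = b.ClientID
    JOIN BillLine AS bl ON bl.BillID = b.BillID
    JOIN Products AS p  ON p.ProductID = bl.ProductID
    WHERE i.InvoiceID = 1
    ORDER BY bl.BillLineID
""").fetchall()
total = sum(price for _, _, price in invoice_lines)

print(invoice_lines, total)
```

Because BillID alone is the key of Bill, the Invoice table can reference it directly, which avoids the composite-key problem described in the question.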

Write query to show all unique occurrences numbered, and with variants listed

I am working on past national censuses stored in an Oracle database. My main tools for working with it are MS Access and LibreOffice Base, depending on what kind of task I have to solve. I do not have direct access to the database; I cannot, for instance, run update queries directly on the main tables, but I can do this on subtables I have created in my environment.
I would like to list all unique standardised names from a census, with the number of instances shown as a count, and with all variants of the name listed in a separate column. How would such a query be written?
In the example below, the …S following Firstname, indicates which standard name the source’s first name is encoded under.
Firstname FirstnameS
Tor Tor
Thor Tor
Per Per
Peer Per
Pær Per
Pär Per
Caroline Karoline
Charoline Karoline
Karoliine Karoline
Desired output
FirstnameS Σ Firstname_variants
Tor 2 Tor, Thor
Per 4 Per, Peer, Pær, Pär
Karoline 3 Caroline, Charoline, Karoliine
───
I hope I’ve provided all information and asked the question in a manner befitting the RoC of Stackoverflow. Be gentle; it’s my first question!
SELECT FirstnameS, COUNT(Firstname) AS Num
FROM myTable
GROUP BY FirstnameS
gives you the first two columns.
The third depends on the database system - can you run Oracle queries (directly or pass-through)?
Edit:
Oracle: SQL Query to concatenate column values from multiple rows in Oracle
MS-Access: Combine values from related rows into a single concatenated string value
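As a rough illustration of the third column, the sketch below uses SQLite's GROUP_CONCAT aggregate through Python's built-in sqlite3, purely so that it can be run without an Oracle instance. In Oracle the closest counterpart is LISTAGG(Firstname, ', ') WITHIN GROUP (ORDER BY Firstname); MS Access has no built-in aggregate of this kind. The table name myTable comes from the query above; everything else is sample data from the question.

```python
import sqlite3

names = [
    ("Tor", "Tor"), ("Thor", "Tor"),
    ("Per", "Per"), ("Peer", "Per"), ("Pær", "Per"), ("Pär", "Per"),
    ("Caroline", "Karoline"), ("Charoline", "Karoline"), ("Karoliine", "Karoline"),
]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE myTable (Firstname TEXT, FirstnameS TEXT)")
con.executemany("INSERT INTO myTable VALUES (?, ?)", names)

# Count the rows per standardised name and concatenate the variants.
# SQLite does not guarantee the order of variants inside the string.
result = con.execute("""
    SELECT FirstnameS,
           COUNT(Firstname) AS Num,
           GROUP_CONCAT(Firstname, ', ') AS Firstname_variants
    FROM myTable
    GROUP BY FirstnameS
""").fetchall()

for row in result:
    print(row)
```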

Correcting Specific Values in SSIS Package.

I am building a report for one of our departments that counts software licenses by cost center. The problem I have is we have our upper level management that is in a specific cost center for organizational purposes (which the license system grabs) but the department requesting the report needs to have the cost center that the managers expense everything to instead.
This affects about 15 entries, but the report pulls over 300, so I only need to correct the 15 without impacting the rest.
I have created a table labeled [dbo].[CostCenter_corrections] with two columns in it: [UserID] (nvarchar, this is the user name, not an employee ID) and [CostCenter_Correction] (int, this is their expense cost center).
What I want to do is either set up a step at the end of the Staging Load that will correct these numbers by UserID, like an Execute SQL Task, or build another SSIS package that will process the Staging Data and reload it into the same staging table (not sure if that is possible or even something that should be done).
If you can think of any other way I'm open to ideas.
Thank you in advance for any help.
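For reference, the Execute SQL Task idea amounts to a single UPDATE that touches only the rows with an entry in the corrections table. The sketch below runs on SQLite through Python's built-in sqlite3 just to show the logic; the Staging table, its CostCenter column and the sample users are hypothetical, and on SQL Server the same statement could also be written with its UPDATE ... FROM join syntax.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- Hypothetical staging table; only the corrections table is from the question.
    CREATE TABLE Staging (UserID TEXT, CostCenter INT);
    CREATE TABLE CostCenter_corrections (UserID TEXT PRIMARY KEY, CostCenter_Correction INT);

    INSERT INTO Staging VALUES ('alice', 100), ('bob', 200), ('carol', 100);
    INSERT INTO CostCenter_corrections VALUES ('alice', 900);
""")

# Overwrite the cost center only where a correction exists for the user;
# rows without a correction are left untouched by the WHERE clause.
con.execute("""
    UPDATE Staging
    SET CostCenter = (
        SELECT c.CostCenter_Correction
        FROM CostCenter_corrections AS c
        WHERE c.UserID = Staging.UserID
    )
    WHERE UserID IN (SELECT UserID FROM CostCenter_corrections)
""")

after = con.execute("SELECT UserID, CostCenter FROM Staging ORDER BY UserID").fetchall()
print(after)
```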
You may need to set the lookup to "Redirect rows to no match output" (you can do this from the General tab), and then you will have two outputs from the lookup, one for matched rows and one for unmatched rows. Then you can do the work you need and union the two pipelines back together. Your DF will look like this:

How to autofill table column in MS Access?

I am trying to develop a simple database that stores takings information for a taxi (daily figures etc.), and there are some calculations that I would like to have auto-filled from basic information supplied by the driver, such as:
gross takings, given a start and end value from a metering device
km's driven, given an odometer reading
driver/owner split, given the day's takings
The problem I have is that I would like to store all these values in a single attribute to make retrieval and entry into a third-party system easier. Currently this is paper based and I am trying to digitize the process.
The operations are simple mathematical expressions such as addition, subtraction and percentage split (multiplication or division).
I've tried various sql commands like
INSERT INTO table (fieldname)
select
table.fieldname1, table.fieldname2, [fieldname2]-[fieldname1]
from
table
I will be using an input form for data entry that will display the basic data input and the driver's share of takings/expenses based upon these calculations.
I'm drawing a blank. I'm using MS Access 2007.
You can do:
INSERT INTO table (fieldname)
SELECT CStr(table.fieldname1) & CStr(table.fieldname2) & CStr([fieldname2]-[fieldname1])
FROM table;
But as @Tarik said, it is not recommended to store all fields in one column, unless it is a temp table or just for viewing.
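An alternative that avoids the single combined column is to store only the raw readings and derive the figures in a query whenever they are needed. The sketch below shows the idea with Python's built-in sqlite3 rather than Access; the Shift table, its column names and the 50% driver share are all made up for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- Hypothetical table holding only what the driver enters.
    CREATE TABLE Shift (ShiftID INTEGER PRIMARY KEY,
                        MeterStart REAL, MeterEnd REAL,
                        OdoStart INT, OdoEnd INT);
    INSERT INTO Shift VALUES (1, 1000.0, 1250.0, 50000, 50180);
""")

DRIVER_SHARE = 0.5  # assumed split, not from the question

# The derived figures are calculated on read, so they cannot drift
# out of step with the readings they are based on.
row = con.execute("""
    SELECT MeterEnd - MeterStart       AS GrossTakings,
           OdoEnd - OdoStart           AS KmDriven,
           (MeterEnd - MeterStart) * ? AS DriverShare
    FROM Shift
    WHERE ShiftID = 1
""", (DRIVER_SHARE,)).fetchone()

print(row)
```

In Access the same expressions could live in a saved query that the input form and any export to the third-party system read from.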

SQL IN statement "inclusiveness"

I'm not a programmer, but trying to learn. I'm a nurse, and need to pull data for medical referral tracking from a database. I have a piece of GUI software which builds JOIN queries for me to pull things from the database. One of the operators I can use in the drop-down is "IN." The referral documentation is stored in the table as codes made up of one to three letters. For example, the code for a completed dental referral is CDF, and the code for a dental referral is D.
I want to build a report to allow other nurses to pull all their outstanding referrals, so I'll want to pull "D" but not "CDF"
If I use IN as the operator, and set my parameters to 'S','D','BP' {etc} will that also pull the records which have the other, longer codes which contain those same letters? (like CDF, CSR, CBP)
I don't want to test it because I only have access to the production database, and I don't want to hose up actual patient records. Thanks in advance for any help!
Assuming that the column that holds the referral code holds one and only one code per record (which is what it sounds like) the query should function as you want and will not attempt to match substrings.
In any event, there's no danger that a query in the form IN ('S', 'D', 'BP') will match substrings. To perform substring matches in SQL you have to use the LIKE operator.
The situation in which this will not work is if the referral code column holds multiple codes separated by commas. This is an all-too-common mistake in designing databases but if the product you're using is commercial rather than home-grown, I think it's very unlikely to be the case. If it is, searching it is much more difficult.
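The difference between IN and LIKE can be seen on a small throwaway table. The sketch below uses Python's built-in sqlite3 with an in-memory database, so nothing in it touches a real system; the Referral table and its Code column are hypothetical, and the codes are the examples from the question. Note also that a SELECT only reads data and does not modify any records.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Referral (RefID INTEGER PRIMARY KEY, Code TEXT)")
con.executemany(
    "INSERT INTO Referral (Code) VALUES (?)",
    [("D",), ("CDF",), ("S",), ("CSR",), ("BP",), ("CBP",)],
)

# IN compares the whole value against each listed item: no substring matching.
in_matches = [r[0] for r in con.execute(
    "SELECT Code FROM Referral WHERE Code IN ('S', 'D', 'BP') ORDER BY Code"
)]

# LIKE with % wildcards is what matches substrings.
like_matches = [r[0] for r in con.execute(
    "SELECT Code FROM Referral WHERE Code LIKE '%D%' ORDER BY Code"
)]

print(in_matches)    # ['BP', 'D', 'S']
print(like_matches)  # ['CDF', 'D']
```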