How to import Excel table with double headers into oracle database - sql

I have this excel table I am trying to transfer over to an oracle database. The thing is that the table has headers that overlap and I'm not sure if there is a way to import this nicely into an Oracle Database.
+-----+-----------+-----------+
| | 2018-01-01| 2018-01-02|
|Item +-----+-----+-----+-----+
| | RMB | USD | RMB | USD |
+-----+-----+-----+-----+-----+
| | | | | |
+-----+-----+-----+-----+-----+
| | | | | |
+-----+-----+-----+-----+-----+
| | | | | |
+-----+-----+-----+-----+-----+
| | | | | |
+-----+-----+-----+-----+-----+
The top headers are just the dates for the month and then their respective data for that date. Is there a way to nicely transfer this to an oracle table?
EDIT: Date field is an actual date such as 02/19/2018.

If you pre-create a table (as I do), then you can start loading from the 3rd line (i.e. skip the first two), putting every Excel column into the appropriate Oracle table column.
Alternatively (& obviously), rename column headers so that file wouldn't have two header levels).

Related

SQL- Query to determine if any NULLS in a column

I need to create a query to determine if any records in a table contain NULLS in the EXPORTED column. If so then it will execute a stored procedure. I tried searching and I could not find anything that queries just one specific column. Below is an example of my table.
+--------+-----------+----------+
| SAMPLE | DATE | EXPORTED |
+--------+-----------+----------+
| S1234 | 9/17/2019 | NULL |
| S1435 | 9/17/2019 | NULL |
| S1536 | 9/17/2019 | YES |
+--------+-----------+----------+

SQL Query to look up one table against another

I have two Excel tables- the first one is the data table and the second one is a look up table. Here is how they are structured-
Data Table
+----------+-------------+----------+----------+
| Category | Subcategory | Division | Business |
+----------+-------------+----------+----------+
| A | Red | Home | Q |
| B | Blue | Office | R |
| C | Green | City | S |
| D | Yellow | State | T |
| D | Red | State | T |
| D | Green | Office | Q |
+----------+-------------+----------+----------+
Lookup Table Lookup Table
+----------+-------------+----------+----------+--------------+
| Category | Subcategory | Division | Business | LookUp Value |
+----------+-------------+----------+----------+--------------+
| 0 | 0 | 0 | Q | ABC |
| B | 0 | Office | 0 | DEF |
| C | Green | 0 | 0 | MNO |
| D | 0 | State | T | RST |
+----------+-------------+----------+----------+--------------+
So I want to add the lookup value column to the data table based on the criteria given in the lookup table. Eg, for the first row in the lookup table, I dont want to lookup on Category, Subcategory, or Division. but if the Business is Q, then I want to populate the lookup value as ABC. Similarly, for the second row I dont want to consider the Subcategory. and Business. but if the Category. is "B" and Division is "Office", I want it to populate DEF. So the result should look like this-
[Final Resulting Data Table]
+----------+-------------+----------+----------+--------------+
| Category | Subcategory | Division | Business | LookUp Value |
+----------+-------------+----------+----------+--------------+
| A | Red | Home | Q | ABC |
| B | Blue | Office | R | DEF |
| C | Green | City | S | MNO |
| D | Yellow | State | T | RST |
| D | Red | State | T | RST |
| D | Green | Office | Q | ABC |
+----------+-------------+----------+----------+--------------+
I am very new to SQL and the actual data set is very complex wih multiple lookup values based on different criteria. IF you think any other scripting language would work better, I am open to that too. My data is in Excel currently
If the data is so complex, you should first consider if you want to put it in a (relational) database (like MS Access, MySQL, etc.) instead of in a spreadsheet (like MS Excel).
Both kind of programs are used for structured data handling, but databases focus primarily on efficient data storage and data integrity (including guarding type safety, required fields, unique fields, required references between various datasets/tables, etc.) and spreadsheets focus primarily on data analysis and calculations.
Relational databases support Structured Query Language (SQL) to let clients query their data. Spreadsheets normally do not use or support SQL (as far as I know).
It is possible to let MS Excel import or reference data in an external data source (like a relational database) to perform analysis and calculations on it.
The other way around is (sometimes) possible too: to link to spreadsheet worksheets as external tables inside a relational database system to - within certain limits - allow that data to be queried using SQL. But using a database to store the data and a spreadsheet (as a database client) to perform analysis on the data in the database would be a more logical design in my opinion.
However, creating such an integrated solution using multiple MS Office applications and/or external databases can be a complex challenge, especially when you are just starting to learn about them.
To be honest, I am not experienced with designing MS Office based solutions, so I cannot guide you around any pitfalls. I do hope, that this answer helps you a little with finding the right way to go here...

sqlite3: Create tables with many rows or one table with more columns

I'm asking for a best practive when creating some tables for localization of an Web Interface in sqlite3
In my first intetion I wanted to create a table with the different languages, and another on for the Messeage Code and Entries.
tblLanguage
+------------+-------------+---------+
| idLangCode | txtLangName | txtCode |
+------------+-------------+---------+
| 1 | English | en |
| 2 | German | de |
| 3 | French | fr |
| 4 | Spanish | es |
| 5 | Chinese | zh |
+------------+-------------+---------+
tblMessageText
+----+-------+--------------------------+------------+
| Id | Code | Message | LanguageID |
+----+-------+--------------------------+------------+
| 1 | 20500 | Set Point changed | 1 |
| 2 | 20500 | Sollwert geändert | 2 |
| 3 | 20500 | Punto de ajuste cambiado | 5 |
+----+-------+--------------------------+------------+
So in the second table I would have several rows with the same Message Code but whith an different language text.
The other possibility would be to have just one table with just one row for each Message Code but an Column for each language.
tblMessageTextMulti
+----+-------+-------------------+-------------------+--------------------------+
| id | Code | txtMessageText_EN | txtMessageText_DE | txtMessageText_ES |
+----+-------+-------------------+-------------------+--------------------------+
| 1 | 20500 | Set Point changed | Sollwert geändert | Punto de ajuste cambiado |
+----+-------+-------------------+-------------------+--------------------------+
My team likes the second solution with just one table more, because it just has one entry for each Message Code, and you see all Language text side by side.
What I like on the first solution is, that I could dynmically Query the langugage with just on line in php:
$query = 'SELECT * FROM qryInfoMessage WHERE idLangCode=' .$Language;
For the second solution with one table I can not store the query itself in the database, because I have to change the Column name in my query dynmically. So I have to put this together in php.
$query = 'SELECT Code, txtMessageText_EN FROM tblMessageTextMulti;
What I dont show here is that my query is much more complex, whith string substitution and date time conversion.
Beside that, what are the advantages or disadvantages of this solutions. Which one should be more perfomant and what is the best practice?

How should I create an SQL table with stock information so that I can add new stocks and new fields easily?

I want to create an SQL table, where I can have any number of stocks (ie. MSFT, GOOG, IBM) and any number of fields (ie. Full Name, Sector, Country). But I want the flexibility to add new stocks and new fields as I go along. Say I want to add a new stock like AAPL, or I want a new boolean field for whether they pay dividends or not. I don't expect to store dynamic fields like CurrentStockPrice, but the information will have to change periodically. For instance, when a company changes its dividend policy. How do I design the table so that I don't have to change its structure?
I had one idea where I could have a new table for each stock, and a master table that has all the stocks, and a pointer to each individual stock's table. That way, I can freely add new stocks, and new fields easily. But I'm not very familiar with SQL, and would like an expert opinion on how it should be implemented.
The simple answer is that your requirements are not a good fit for SQL. The most important concern is not how to store the data, but how you will retrieve it - what kind of query will you need to run?
EAV allows you to store data whose schema you don't know in advance - but has lots of drawbacks when querying. Even moderately complex queries (find all stocks where the dividend was paid between 1 and 12 Jan, in the tech sector, whose CEO is female) run into a lot of compexity.
Creating a new table for each type of record very quickly gets crazy too - imagine the query above if you have to search dozens or hundreds of type-specific tables.
The relational model works best when you know the schema of the information in advance.
If you don't know the schema, consider using a NoSQL solution, or use SQL Server's support for XML or JSON. Store the fixed data in rows & columns, and the variable data in XML or JSON. Performance for searching is pretty good, and it's much less convoluted as a solution.
Just to expand on my comment, because the question itself begs for a couple of common schema anti-patterns. Some hybrid of EAV may actually be a good fit if you are willing to give up some flexibility and simplicity in your SQL and you aren't looking for fast queries.
EAV
EAV, or Entity-Attribute-Value is a design where, in your case, you would have a master table of stocks with some common attributes, or maybe even ticker info with a datetime. Something like:
+---------+--------+--------------+
| stockid | symbol | name |
+---------+--------+--------------+
| 1 | goog | Google |
| 2 | msft | Microsoft |
| 3 | gpro | GoPro |
| 4 | xom | Exxon Mobile |
+---------+--------+--------------+
And a second table (the EAV table) to store ever changing attributes:
+---------+-----------+------------+
| stockid | attribute | value |
+---------+-----------+------------+
| 1 | country | us |
| 1 | favorite | TRUE |
| 1 | startyear | 2004 |
| 3 | favorite | |
| 3 | bobspick | TRUE |
| 4 | country | us |
| 3 | country | us |
| 2 | startyear | 1986 |
| 2 | employees | 18000 |
| 3 | marketcap | 1850000000 |
+---------+-----------+------------+
And perhaps a third table to get that minute by minute ticker info stored:
+---------+----------------+--------+
| stockid | datetime | value |
+---------+----------------+--------+
| 1 | 9/21/2016 8:15 | 771.41 |
| 1 | 9/21/2016 8:14 | 771.39 |
| 1 | 9/21/2016 8:12 | 771.37 |
| 1 | 9/21/2016 8:10 | 771.35 |
| 1 | 9/21/2016 8:08 | 771.33 |
| 1 | 9/21/2016 8:06 | 771.31 |
| 1 | 9/21/2016 8:04 | 771.29 |
| 2 | 9/21/2016 8:15 | 56.81 |
| 2 | 9/21/2016 8:14 | 56.82 |
| 2 | 9/21/2016 8:12 | 56.83 |
| 2 | 9/21/2016 8:10 | 56.84 |
+---------+----------------+--------+
Generally this is considered not great design since stitching data back together in a format like:
+-------------+-----------+---------+-----------+----------+--------------+
| stocksymbol | stockname | country | startyear | bobspick | currentvalue |
+-------------+-----------+---------+-----------+----------+--------------+
causes you to write a query that is not fun to look at:
SELECT
stocks.stocksymbol,
stocks.name,
country.value,
bobspick.value,
startyear.value,
stockvalue.stockvalue
FROM
stocks
LEFT OUTER JOIN (SELECT stockid, value FROM fieldsTable WHERE attribute = 'country') as country ON
stocks.stockid = country.stockid
LEFT OUTER JOIN (SELECT stockid, value FROM fieldsTable WHERE attribute = 'Bobspick') as bobspick ON
stocks.stockid = bobspick.stockid
LEFT OUTER JOIN (SELECT stockid, value FROM fieldsTable WHERE attribute = 'startyear') as startyear ON
stocks.stockid = startyear.stockid
LEFT OUTER JOIN (SELECT max(value) as stockvalue, stockid FROM ticketTable GROUP BY stockid) as stockvalue ON
stocks.stockid = stockvalue.stockid
WHERE symbol in ('goog', 'msft')
You can see that every "field" in the EAV table gets its own subquery, which means we read that table from storage three times. We gain the flexibility on the front end over the database design, but we lose flexibility when querying.
Imagine a more traditional schema:
+---------+--------+--------------+---------+----------+----------+-----------+------------+-----------+
| stockid | symbol | name | country | bobspick | favorite | startyear | marketcap | employees |
+---------+--------+--------------+---------+----------+----------+-----------+------------+-----------+
| 1 | goog | Google | us | | TRUE | 2004 | | |
| 2 | msft | Microsoft | | | | 1986 | | 18000 |
| 3 | gpro | GoPro | us | TRUE | | | 1850000000 | |
| 4 | xom | Exxon Mobile | us | | | | | |
| | | | | | | | | |
+---------+--------+--------------+---------+----------+----------+-----------+------------+-----------+
and
+---------+----------------+--------+
| stockid | datetime | value |
+---------+----------------+--------+
| 1 | 9/21/2016 8:15 | 771.41 |
| 1 | 9/21/2016 8:14 | 771.39 |
| 1 | 9/21/2016 8:12 | 771.37 |
| 1 | 9/21/2016 8:10 | 771.35 |
| 1 | 9/21/2016 8:08 | 771.33 |
| 1 | 9/21/2016 8:06 | 771.31 |
| 1 | 9/21/2016 8:04 | 771.29 |
| 2 | 9/21/2016 8:15 | 56.81 |
| 2 | 9/21/2016 8:14 | 56.82 |
| 2 | 9/21/2016 8:12 | 56.83 |
| 2 | 9/21/2016 8:10 | 56.84 |
+---------+----------------+--------+
To get the same results:
SELECT
stocks.stocksymbol,
stocks.name,
stocks.country,
stocks.bobspick,
stocks.startyear,
stockvalue.stockvalue
FROM
stocks
LEFT OUTER JOIN (SELECT max(value) as stockvalue, stockid FROM ticketTable GROUP BY stockid) as stockvalue ON
stocks.stockid = stockvalue.stockid
WHERE symbol in ('goog', 'msft')
Now we have the flexibility in the query where we can quickly change out fields without monkeying around in subqueries, but we have to hassle our DBA every time we want to add a field.
There is a further abstraction from EAV that is definitely something to avoid. I don't know if it has a name, but I call it "Database in a database". Here you have a table of tables, table of fields, and a table of values. The entire schema is kept as records as our the values that would be stored in the schema. Ultimatele flexibility is gained, but the sql you will write to get at your data will be nightmarish and your query speeds will degrade at a fast rate as you add to your data/schema/data/schema mess.
As for your last idea of adding a new table for each stock, if the fields you are going to track for each stock are different (startyear, employees, and market cap for one stock and marketmax, country, address, yearsinbusiness in another) and you aren't planning on adding new stocks often, then it may be a good fit. I'm betting though that the attributes/fields that you track on stock1 are going to also be tracked on stock2, and therefore suggest that your should have a single stock table with all those common attributes and maybe an EAV to track attributes that are particular to each stock so you can have the flexibility you need.
In each of these schemas I would also suggest that you put your ticker data in it's own table. Whether you are capturing ticket data by the minute, hour, day, week, or month, because it's datetime level data, it deserves it's own table. (Unless you are only going to track the most current value, then it becomes a field).
If you want to add fields dynamically, but without actually altering the schema of the table, then you should use a vertical schema for the table and retrieve the data via a PIVOT statement.
In this manner you can add as many Field/Value pairs as you wish for each stock/customer pairing.
The basic table would have 5 columns perhaps:
ID (Identity); StockName; AttributeName; Value; Timestamp;
If you take a look at how SQL organizes it's table schema in INFORMATION_SCHEMA.COLUMNS, it provides this very same vertical schema layout for you.

Excel VBA to transpose set of rows if value exists in another column

I'm trying to find a way via VB script that will transpose rows from column A into a new sheet but only if there is a value in column B for rows that contain numbers. I have a sheet with ~75K rows on it that I need to do this for, and I tried creating pivot tables which allowed me to get the data into its current format but I need the data to be in columns.
The tricky part of this is that in column A, I only need to look at the rows that are all numbers and not the other rows that have text.
I created a sample sheet to view, where the sample data is in the SOURCE tab and what I want the data to look like in the TRANSPOSED tab.
https://docs.google.com/spreadsheets/d/1ujbaouZFqiPw0DbO78PCnz25OY2ugF1HtUqMg_J7KeI/edit?usp=sharing
Any help would be appreciated.
UPDATE and Answer:
I modified my approach and went back to the original source data which was not part of a pivot table and was able to use a simple match formula between the 2 data sources. So, my original data looked like this:
+----------------+---------+--------+--------------+
| Gtin | Brand | Name | TaxonomyText |
+----------------+---------+--------+--------------+
| 00030085075605 | brand 1 | name 1 | cat1 |
| 00041100015112 | brand 2 | name 2 | cat2 |
| 00041100015099 | brand 3 | name 3 | cat3 |
| 00030085075608 | brand 4 | name 4 | cat4 |
+----------------+---------+--------+--------------+
I had another sheet containing the data I needed to match to in this format:
+----------------+---------+
| Gtin | Brand |
+----------------+---------+
| 00030085075605 | brand 1 |
| 00041100015112 | brand 2 |
| 00041100015098 | brand 3 |
| 00030085075608 | brand 4 |
+----------------+---------+
I created a new column in my source sheet and used a if error match formula:
=IFERROR(IF(MATCH(A14,data_to_match!$A:$A,0),"yes",),"no")
Then copied this formula down for every row, about 75K rows which very quickly added a yes or a no.
+----------------+---------+---------+--------+--------------+
| Gtin | matched | Brand | Name | TaxonomyText |
+----------------+---------+---------+--------+--------------+
| 00030085075605 | yes | brand 1 | name 1 | cat1 |
| 00041100015112 | yes | brand 2 | name 2 | cat2 |
| 00041100015098 | no | brand 3 | name 3 | cat3 |
| 00030085075608 | yes | brand 4 | name 4 | cat4 |
+----------------+---------+---------+--------+--------------+
The final step was to just filter for Yes values and I had all the data that I needed.
My mistake was going to a pivot table first which put the data in a very funky format causing me to have to do a transpose, which wasn't really necessary. Hopefully this can help others....