Searching SQLite - sql

I was wondering if there was a way to search an entire SQLite database for one specific word. I do not know the column that it is in or even the table that it is in.
The table and row/column that contains this specific word also contains the other entries that I need to edit.
In short:
Need to find a specific word
Can't query (I don't think I can, at least) since I don't know the table or column name that it's located in.
I need to know where this specific word is referenced: in what table and row, so I can access the other values that are alongside it.
Basically, is there a CTRL+F functionality in SQLite that searches the entirety of the SQLite file?
I have mac/windows/linux machines. I am not limited by software if that is a solution.

Any such functionality would essentially be running queries that check every column of every table. You can do that via a script that runs the following SQL:
1) Get a list of all the tables:
select name from sqlite_master where type = 'table'
2) For each table, get all of its columns (column name is available in the name field)
pragma table_info(cows)
3) Then for each table, generate a query that checks every field and run it:
select
*
from cows
where name like '%Daisy%'
or owner like '%Daisy%'
or farm like '%Daisy%'
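
The three steps above can be sketched as a small script. This is only a minimal sketch using Python's built-in sqlite3 module; the function names search_database and quote_ident are made up for illustration, and it assumes ordinary rowid tables (a WITHOUT ROWID table would need its primary key selected instead). Note that LIKE is case-insensitive for ASCII letters by default.

```python
import sqlite3


def quote_ident(name):
    """Quote a table or column name so unusual names are safe in generated SQL."""
    return '"' + name.replace('"', '""') + '"'


def search_database(conn, word):
    """Yield (table, column, rowid) for every cell whose text contains `word`."""
    # LIKE treats % and _ as wildcards, so escape them in the search term.
    escaped = word.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    pattern = "%" + escaped + "%"
    # Step 1: list all user tables (names starting with sqlite_ are internal).
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'")]
    for table in tables:
        # Step 2: the column name is the second field of each table_info row.
        columns = [r[1] for r in conn.execute(
            "PRAGMA table_info(" + quote_ident(table) + ")")]
        # Step 3: check every column of the table for the search term.
        for column in columns:
            sql = ("SELECT rowid FROM {t} WHERE {c} LIKE ? ESCAPE '\\'"
                   .format(t=quote_ident(table), c=quote_ident(column)))
            for (rowid,) in conn.execute(sql, (pattern,)):
                yield table, column, rowid
```

With a connection from sqlite3.connect("your_file.db"), looping over search_database(conn, "Daisy") gives the table, column and rowid of each hit, and a SELECT * on that table with WHERE rowid = ? then shows the values stored alongside it.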

Related

PDI/Kettle - Passing data from previous hop to database query

I'm new to PDI and Kettle, and what I thought was a simple experiment to teach myself some basics has turned into a lot of frustration.
I want to check a database to see if a particular record exists (i.e. vendor). I would like to get the name of the vendor from reading a flat file (.CSV).
My first hurdle is selecting only the vendor name from the 8 fields in the CSV.
The second hurdle is how to use that vendor name as a variable in a database query.
My third issue is what type of step to use for the database lookup.
I tried a dynamic SQL query, but I couldn't determine how to build the query using a variable, then how to pass the desired value to the variable.
The database table (VendorRatings) has 30 fields, one of which is vendor. The CSV has 8 fields, one of which is also vendor.
My best effort was to use a dynamic query using:
SELECT * FROM VENDORRATINGS WHERE VENDOR = ?
How do I programmatically assign the desired value to "?" in the query? Specifically, how do I link the output of a specific field from Text File Input to the "vendor = ?" SQL query?
The best practice is a Stream lookup step: for each record in the main flow (VendorRating), look up the vendor details (the lookup fields) in the reference file (the CSV), based on its identifier (possibly its number, its name, or firstname+lastname).
First "hurdle": once the path of the CSV file is defined, press the Get field button.
It will take the first line as the header to get the field names and explore the first 100 (customizable) records to determine the field types.
If the names are not on the first line, uncheck Header row present, press the Get field button, and then change the names in the panel.
If there is more than one header row, or there are other complexities, use the Text file input step.
The same is valid for the lookup step: use the Get lookup field button and delete the fields you do not need.
Because:
There is at most one vendorrating per vendor.
You have to do something if there is no match.
I suggest the following flow:
Read the CSV and, for each row, look it up in the table (i.e. the lookup table is the SQL table rather than the CSV file), and set a default value when there is no match. I suggest something really visible like "--- NO MATCH ---".
Then, in case of no match, the filter redirects the flow to the alternative action (here: insert into the SQL table). The two flows are then merged into the downstream flow.

BigQuery - remove unused column from schema

I accidentally added a wrong column to my BigQuery table schema.
Instead of reloading the complete table (millions of rows), I would like to know if the following is possible:
remove bad rows (rows that contain values in the wrong column) by running a "select *" query on the table with some kind of filter, and saving the result to the same table.
remove the (now) unused column.
Is this functionality (or similar) supported?
Possibly the "save result to table" functionality can have a "compact schema" option.
According to the documentation, the simplest and quickest way to remove a column from BigQuery is:
ALTER TABLE [table_name] DROP COLUMN IF EXISTS [column_name]
If your table does not consist of record/repeated type fields - your simple option is:
1) Select valid columns while filtering out bad records into a new temp table:
SELECT < list of original columns >
FROM YourTable
WHERE < filter to remove bad entries here >
Write the above to a temp table - YourTable_Temp
2) Make a backup copy of the "broken" table - YourTable_Backup
3) Delete YourTable
4) Copy YourTable_Temp to YourTable
5) Check that everything looks as expected and, if so, get rid of the temp and backup tables
Please note: the cost of #1 above is exactly the same as that of the action in the first bullet of your question. The remaining actions (copies) are free.
If you have repeated/record fields, you can still execute the above plan, but in #1 you will need to use BigQuery User-Defined Functions to get the proper schema in the output.
See below for examples. Of course this will require some extra development, but if you are in a critical situation, this should work for you:
Create a table with Record type column
create a table with a column type RECORD
I hope that at some point the Google BigQuery team will add better support for cases like yours, when you need to manipulate and output repeated/record data, but for now this is the best workaround I have found, at least for myself.
Below is the code to do it. Let's say c is the column that you want to delete.
CREATE OR REPLACE TABLE transactions.test_table AS
SELECT * EXCEPT (c) FROM transactions.test_table;
Or, as a second method and my favorite, follow the steps below.
Write a SELECT query that leaves out the columns you want to remove.
Go to Query Settings.
Under Destination, select "Set a destination table for query results" and enter the project name, dataset name and table name exactly as they appear in the query from Step 1.
Under "Destination table write preference", select "Overwrite table".
Save the Query Setting and run the query.
Saving the results to a table is the way to go. Try it on the big table with only the columns you are interested in; you can apply a LIMIT to keep the trial run small.

SSIS Check Excel source rows redirect rows to another table on 'x' number of field matches

I work in a sales based environment and our data consists of 'leads'.
Let's say we record CompanyName, PhoneNumber, Address1 & PostCode (ZIP). These rows are seeded with a unique ID in the schema.
The leads come in from various sources and are compiled onto a spreadsheet and then imported into SQL 2012 using SSIS.
After a validation check to see if a file exists we then use a simple data flow which consists of an Excel source, Derived Column, Data Conversion and finally an OLE DB Destination.
My requirement, I'm sure, has a relatively simple solution; understanding what I need to achieve is the first step. I need to take a sample of data from the last rolling two months; if 2 or more fields in the source Excel file match the corresponding fields in the destination SQL table, then I want to redirect the row to another table.
I am unsure of which combination of components I could use to achieve this. I believe that Fuzzy Lookup may not be what I am looking for, as I am looking to find exact field matches. I have looked at the Lookup component, but I am unsure if this is the way to go.
Could anyone please provide some advice on how I can best achieve this as simply as possible?
You can use the Lookup to check for matches in your existing table. However, it will be fairly complicated to implement the requirement of checking for any two or more fields matching. Your expression would be long and complex, basically consisting of:
(using pseudo code for readability)
IIF((a=a AND b=b) OR (a=a AND c=c) OR (b=b AND c=c) OR ...and so on
for every combination of two columns you want to test
I would do this by importing the entire spreadsheet to a staging table, and doing the existing rows check in a SQL stored proc that moves the data to the desired destination table.
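
One way to avoid writing out every pair of columns in that staging-table check is to add up one CASE expression per field and compare the total to 2. The sketch below only illustrates the idea with Python's sqlite3 module standing in for SQL Server; the table names Staging and Leads, the StagingId column, and the function names are assumptions made for illustration, and the rolling two-month filter is left out.

```python
import sqlite3

# Fields compared between the staging table and the existing leads.
FIELDS = ["CompanyName", "PhoneNumber", "Address1", "PostCode"]


def match_count_expr(fields, left="s", right="d"):
    """Build an SQL expression that counts how many fields are equal."""
    # NULL = NULL is not true in SQL, so missing values never count as a match.
    return " + ".join(
        "CASE WHEN {l}.{f} = {r}.{f} THEN 1 ELSE 0 END".format(l=left, r=right, f=f)
        for f in fields)


def find_probable_duplicates(conn, min_matches=2):
    """Return ids of staging rows sharing at least `min_matches` fields with an existing lead."""
    sql = ("SELECT DISTINCT s.StagingId FROM Staging AS s "
           "JOIN Leads AS d ON (" + match_count_expr(FIELDS) + ") >= ? "
           "ORDER BY s.StagingId")
    return [r[0] for r in conn.execute(sql, (min_matches,))]
```

The same CASE-sum condition could be used in a stored proc to route the matching staging rows to the other table and insert the rest into the destination.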

Database Search that Compares Results for Multiple Search Keywords

First, let me say I know very little about the SQL language and am trying to learn (albeit very slowly). I have created a database table with columns for
ECOREGION_ID
ECOREGION_NAME
SPECIES_NAME
CLASS
so that there is one row for each species name in each ecoregion. My end goal is to create a form in which I can enter multiple species names and search for the ecoregions they share. For example, if I enter "Tiger", "Red Panda", "Sloth Bear", and "Rhino" into the 4 different search boxes, it would bring up a list of all the ecoregions that these four species share. I am wondering a few things:
Is my data set up in the correct way in order to do this, or is there a more efficient way to set it up?
What SQL statement should I use to perform the search I want?
What is the technical term for what I am wanting to do? I have tried many different searches on different forums and can't seem to find what I am looking for, mostly because I probably don't know what to search for, lol.
Thanks,
-Drew
You have ECORegion_ID and ECORegion_Name in the same table. I would suggest creating a separate table to hold ECORegions. This table would have both an ID and a Name. The search table would then only have the ECORegion_ID. This process is called normalization. It basically reduces redundant data in your database.
You are looking for a SELECT statement, which is used to pull data out of one or more tables. The statement has a WHERE option to restrict which rows you bring back, and an IN expression as part of the WHERE to allow you to look for multiple keywords.
Search for Normalization to see why to put the region name in a separate table. Look up SQL Select to get the syntax for the select statement, and you should get off to a good start.
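
Note that IN on its own returns ecoregions containing any of the species. To keep only the ecoregions that contain all of them, one common approach (not spelled out above) is to group by ecoregion and require the number of distinct matching species to equal the number searched for. A minimal sketch with Python's sqlite3 module, assuming the original single-table layout; the table name ecoregion_species and the function name are made up for illustration:

```python
import sqlite3


def shared_ecoregions(conn, species):
    """Return names of ecoregions that contain every species in `species`."""
    wanted = sorted(set(species))  # ignore repeated search terms
    if not wanted:
        return []
    placeholders = ", ".join("?" for _ in wanted)
    sql = ("SELECT ECOREGION_NAME FROM ecoregion_species "
           "WHERE SPECIES_NAME IN (" + placeholders + ") "
           "GROUP BY ECOREGION_ID, ECOREGION_NAME "
           # Keep a region only if all of the wanted species were found in it.
           "HAVING COUNT(DISTINCT SPECIES_NAME) = ? "
           "ORDER BY ECOREGION_NAME")
    return [r[0] for r in conn.execute(sql, wanted + [len(wanted)])]
```

Empty search boxes on the form can simply be left out of the list passed in, since the required count is taken from the list itself.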

Access 2010 Database Cleanup

I have problems with the records in my database. I have a template with about 260,000 records, and each record has 3 identification columns to determine what time period and location the record is from: one for year, one for month, and one for region. The information identifying the specific item is then TagName and Description. The Problem I am having is when someone entered data into this database they entered different description for the same device, I know this because the tag name is the same. Can I write code that will go through the database, find the items with the same tag name, and use one of the descriptions to replace the ones that are different, to have a more uniform database? Also some devices do not have tag names so we would want to avoid the "" Case.
Also moving forward into the future I have added more columns to the database to allow for more information to be retrieved. Is there a way that I can backfill the data to older records once I know that they have the same tag name and description, once the database is cleaned up? Thanks in advance; the information is much appreciated.
I assume that this will have to be done with VBA of some sort to modify records by looking for the first record with that description and using a variable to assign that description to all the other items with the same tag name? I just am not sure of the correct VBA syntax to go about this. I assume a similar method would be used for the backfilling process?
Your question is rather broad and multifaceted, so I'll answer key parts in steps:
The Problem I am having is when someone entered data into this
database they entered different description for the same device, I
know this because the tag name is the same.
While you could fix up those inconsistencies easily enough with a bit of SQL code, it would be better to avoid those inconsistencies being possible in the first place:
Create a new table, let's call it 'Tags', with TagName and TagDescription fields, and with TagName set as the primary key. Ensure both fields have their Required setting set to True and Allow Zero Length set to False.
Populate this new table with all possible tags - you can do this with a one-off 'append query' in Access jargon (INSERT INTO statement in SQL).
Delete the tag description column from the main table.
Go into the Relationships view and add a one-to-many relation between the two tables, linking the TagName field in the main table to the TagName field in the Tags table.
As required, create a query that aggregates data from the two tables.
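
The one-off append query in the steps above could look roughly like the following. This is only a sketch run against SQLite through Python's sqlite3 module as a stand-in for Access, so the exact syntax in Access may differ; the main table name Records and the function name are assumptions, and MIN(Description) is just one arbitrary way of picking a single description per tag.

```python
import sqlite3


def build_tags_table(conn):
    """Create Tags and fill it with one description per non-empty tag name."""
    conn.execute(
        "CREATE TABLE Tags ("
        "TagName TEXT NOT NULL PRIMARY KEY, "
        "TagDescription TEXT NOT NULL)")
    # One row per tag; rows with a missing or empty TagName are skipped.
    conn.execute(
        "INSERT INTO Tags (TagName, TagDescription) "
        "SELECT TagName, MIN(Description) FROM Records "
        "WHERE TagName IS NOT NULL AND TagName <> '' "
        "AND Description IS NOT NULL "
        "GROUP BY TagName")
    conn.commit()
```

Reviewing the resulting Tags rows by hand before deleting the description column from the main table is a cheap safeguard, since the description kept for each tag is picked mechanically.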
Also some devices do not have tag names so we would want to avoid the
"" Case.
In Access, the concept of an empty string ("") is different from the concept of a true blank or 'null'. As such, it would be a good idea to replace all empty strings (if there are any) with nulls -
UPDATE MyTable SET TagName = Null WHERE TagName = '';
You can then set the TagName field's Allow Zero Length property to False in the table designer.
Also moving forward into the future I have added more columns to the
database to allow for more information to be retrieved
Think less in terms of more columns than more tables.
I assume that this will have to be done with VBA of some sort to modify records
Either VBA, SQL, or the Access query designers (which create SQL code behind the scenes). In terms of being able to crunch through data the quickest, SQL is best, though pure VBA (and in particular, using the DAO object library) can be easier to understand and follow.