I am currently trying to figure out how I could use a full-text index to let users search for keyword(s) within the DB data. The issue I am having is that I want to know not only the source table but also the source column, so I can tell the user where the hit was found.
I can do it by using INFORMATION_SCHEMA and building a large table that I can put an index on, but then I have to keep that table in sync with the source tables.
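Something along these lines is what I mean - a single search table recording where each piece of text came from (this is only a T-SQL-flavoured sketch; all table and column names are made up for illustration):

CREATE TABLE dbo.SearchIndex (
    SearchId     int IDENTITY(1,1) PRIMARY KEY,
    SourceTable  sysname        NOT NULL,  -- which table the text came from
    SourceColumn sysname        NOT NULL,  -- which column the text came from
    SourceKey    nvarchar(100)  NOT NULL,  -- key of the source row
    TextValue    nvarchar(max)  NOT NULL   -- the text itself, to be full-text indexed
);

-- one INSERT per table/column; these statements could be generated
-- from INFORMATION_SCHEMA.COLUMNS
INSERT INTO dbo.SearchIndex (SourceTable, SourceColumn, SourceKey, TextValue)
SELECT 'Customers', 'Notes', CAST(CustomerId AS nvarchar(100)), Notes
FROM dbo.Customers
WHERE Notes IS NOT NULL;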
Any other thoughts on how to do something like this?
Thanks,
S
Could you do separate queries?
Related
I want to find which tables/columns in Redshift remain unused in the database in order to do a clean-up.
I have been trying to parse the queries from the stl_query table, but it turns out this is quite a complex task, and I haven't found any library that I can use.
Anyone knows if this is somehow possible?
Thank you!
The column question is a tricky one. For table usage information I'd look at stl_scan, which records info about every table scan step performed by the system. Each of these is date-stamped, so you will know when the table was "used". Just remember that system logging tables are pruned periodically and the data only goes back a few days, so you may need a process that captures table usage daily to build an extended history.
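Something like this (untested, but stl_scan and svv_table_info are standard Redshift system tables) gives the most recent scan per table:

-- most recent scan per table, joining scan steps to readable table names
SELECT ti."schema", ti."table", MAX(s.starttime) AS last_scanned
FROM stl_scan s
JOIN svv_table_info ti ON ti.table_id = s.tbl
GROUP BY ti."schema", ti."table"
ORDER BY last_scanned;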
Pondering the column question some more: query ids are also provided in stl_scan, and these could help identify the columns used in the query text. For every query id that scans table_A, search the query text for each column name of the table. It wouldn't be perfect, but it's a start.
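As a rough sketch of that idea (123456 stands in for the table id of table_A, my_column for one of its columns; stl_query.querytxt is truncated, so this is only approximate):

SELECT q.query, q.querytxt
FROM stl_query q
JOIN (SELECT DISTINCT query FROM stl_scan WHERE tbl = 123456) s
  ON s.query = q.query
WHERE q.querytxt ILIKE '%my_column%';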
I heavily use BigQuery, and there are now quite a number of intermediate tables. Because teammates can upload their own tables, I don't know all the tables well.
I want to check whether a table has not been used for a long time, and then decide whether it can be deleted manually.
Does anyone know how to do this?
Many thanks
You could use the logs if you have access. If you make yourself familiar with how to filter log entries, you can find out about your usage quite easily: https://cloud.google.com/logging/docs/quickstart-sdk#explore
There's also the possibility of exporting the logs to BigQuery, so you could analyze them using SQL - I guess that's even more convenient.
You can get table-specific metadata via the __TABLES__ meta-table:
SELECT *, TIMESTAMP_MILLIS(last_modified_time) AS last_modified
FROM `[DATASET].__TABLES__`
The snippet above gives you the last modified time for each table, which is the closest thing to a last access date that __TABLES__ exposes.
I have 2 tables, old_test and new_test (a Bible database).
old_test table has 7959 rows
new_test table has 23145 rows
I want to use a LIKE query to search for verses across the two tables.
For example:
SELECT *
FROM old_test
where text like "%'+searchword+'%"
union all
SELECT *
FROM new_test
where text like "%'+searchword+'%"
It works, but it takes a long time to show the results.
What is the best solution to make this search much faster?
Thanks
Your %searchword% pattern causes a table scan, so it will get slower as the number of records increases. Use a searchword% pattern (no leading wildcard) to get a fast, index-based query.
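A sketch of what that looks like in SQLite/Web SQL (the index name is made up; the NOCASE collation matters because SQLite's LIKE is case-insensitive by default):

CREATE INDEX IF NOT EXISTS idx_old_test_text ON old_test(text COLLATE NOCASE);

-- prefix match: can use the index
SELECT * FROM old_test WHERE text LIKE 'searchword%';

-- leading wildcard: still forces a full scan
SELECT * FROM old_test WHERE text LIKE '%searchword%';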
What you need is full-text search, which is not available in websql.
I suggest my own open source library, https://github.com/yathit/ydn-db-fulltext for full-text search implementation. It works with newer IndexedDB API as well.
The main problem with your query is that LIKE has to scan entire fields to find the string - building an index that can be queried instead should alleviate the problem.
Looking at Web SQL it uses the SQLite engine:
User agents must implement the SQL dialect supported by Sqlite 3.6.19.
http://www.w3.org/TR/webdatabase/#parsing-and-processing-sql-statements
Based on that, I would recommend trying to build a full-text index over the table to make these searches run quickly http://www.sqlite.org/fts3.html
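For example, with FTS3 you could mirror the verses into a virtual table and query it with MATCH (this assumes the browser's SQLite build includes FTS3, and the column names are guesses based on your description):

CREATE VIRTUAL TABLE verses_fts USING fts3(source, text);
INSERT INTO verses_fts (source, text) SELECT 'old_test', text FROM old_test;
INSERT INTO verses_fts (source, text) SELECT 'new_test', text FROM new_test;

SELECT * FROM verses_fts WHERE text MATCH 'searchword';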
I have a table in my SQL Server 2005 database which contains about 50 million records.
I have FirstName and LastName columns, and I would like to allow the user to search on these columns without it taking forever.
Aside from indexing these columns, is there a way to make my query fast?
Also, I want to search for similar-sounding names. For example, if the user searches for Danny, I would like to return records with the names Dan and Daniel as well. It would also be nice to show the user a percentage rank of how close each result is to what he actually searched for.
I know this is a tough task, but I bet I'm not the first one in the world to face this issue :)
Thanks for your help.
We have databases with half a billion records (Oracle, but performance should be similar). You can search them within a few milliseconds if you have proper indexes. In your case, place an index on firstname and lastname. A binary-tree index will perform well and will scale with the size of your database. Be careful: LIKE clauses often break the use of the index and largely degrade performance. I know MySQL can keep using indexes with LIKE clauses when the wildcard is only at the right of the string; you would have to check the equivalent behaviour for SQL Server.
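For instance (the index and table names below are made up):

CREATE NONCLUSTERED INDEX IX_Person_LastName_FirstName
    ON dbo.Person (LastName, FirstName);

-- trailing wildcard only: the index can be used for a seek
SELECT FirstName, LastName
FROM dbo.Person
WHERE LastName LIKE N'Dan%';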
String similarity is indeed not simple. Have a look at http://en.wikipedia.org/wiki/Category:String_similarity_measures - you'll see some of the possible algorithms. I can't say whether SQL Server implements any of them; I don't know this database. Try Googling "SQL Server" plus the name of an algorithm to maybe find what you need. Otherwise, Wikipedia provides code in various languages (maybe not SQL, but you should be able to adapt it for a stored procedure).
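For what it's worth, SQL Server does ship two very crude phonetic helpers, SOUNDEX and DIFFERENCE, which can serve as a starting point for the Danny/Dan/Daniel case (the table and column names below are made up, and this won't use an index):

SELECT FirstName, LastName,
       DIFFERENCE(FirstName, 'Danny') AS SoundexScore  -- 0 (no match) to 4 (strong match)
FROM dbo.Person
WHERE DIFFERENCE(FirstName, 'Danny') >= 3
ORDER BY SoundexScore DESC;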
Have you tried full-text indexing? I used it on free-text fields in a table with over 1 million records and found it to be pretty fast. Plus you can add synonyms to it, so that Dan, Daniel, and Danny all index as the same (where you get the dictionary of name equivalents is a different story). It allows wildcard searches as well. Full-text indexing can also do rank, though I found that less useful on names (better for documents).
Enable FULL TEXT SEARCH for this table and those columns; that will create a full-text index on them.
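A rough sketch of what that looks like on SQL Server 2005 (the catalog, key index, table, and column names are placeholders):

CREATE FULLTEXT CATALOG NameCatalog AS DEFAULT;

CREATE FULLTEXT INDEX ON dbo.Person (FirstName, LastName)
    KEY INDEX PK_Person ON NameCatalog;

-- prefix search with a relevance rank
SELECT p.FirstName, p.LastName, ft.RANK
FROM dbo.Person p
JOIN CONTAINSTABLE(dbo.Person, (FirstName, LastName), '"Dan*"') ft
    ON ft.[KEY] = p.PersonId;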
I'm working with a SQL Server 2000 database that likely has a few dozen tables that are no longer accessed. I'd like to clear out the data that we no longer need to be maintaining, but I'm not sure how to identify which tables to remove.
The database is shared by several different applications, so I can't be 100% confident that reviewing these will give me a complete list of the objects that are used.
What I'd like to do, if it's possible, is to get a list of tables that haven't been accessed at all for some period of time. No reads, no writes. How should I approach this?
MSSQL 2000 won't give you that kind of information. But one way to identify which tables ARE used (and then deduce which ones are not) is to use SQL Profiler to capture all the queries that go to a certain database. Configure the profiler to record the results to a new table, and then check the queries saved there to find all the tables (and views, SPs, etc.) that are used by your applications.
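Once the trace is saved to a table (dbo.ProfilerTrace here is just whatever name you gave it when saving), something like this lists the captured statements that mention a given table:

SELECT DISTINCT CAST(TextData AS nvarchar(4000)) AS Statement
FROM dbo.ProfilerTrace
WHERE TextData LIKE N'%MyTable%';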
Another way you might check for "writes" is to add a new datetime column to every table, plus a trigger that updates that column every time there's an update or an insert. But keep in mind that if your apps do queries of the type
select * from ...
then they will receive a new column and that might cause you some problems.
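The trigger part would look roughly like this for one table (column, trigger, and key names are only an example; the real join key would be that table's primary key):

ALTER TABLE dbo.SomeTable ADD LastWriteTime datetime NULL;
GO

CREATE TRIGGER trg_SomeTable_LastWrite
ON dbo.SomeTable
AFTER INSERT, UPDATE
AS
    -- stamp every inserted/updated row with the current time
    UPDATE t
    SET LastWriteTime = GETDATE()
    FROM dbo.SomeTable t
    JOIN inserted i ON i.SomeTableId = t.SomeTableId;
GO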
Another suggestion for tracking tables that have been written to is to use Red Gate SQL Log Rescue (free). This tool dives into the log of the database and will show you all inserts, updates and deletes. The list is fully searchable, too.
It doesn't meet your criterion of tracking reads from the database, but I think the SQL Profiler technique will give you a fair idea as far as that goes.
If you have lastupdate columns you can check for writes; there is really no easy way to check for reads. You could run Profiler, save the trace to a table, and check in there.
What I usually do is rename the table by prefixing it with an underscore; when people start to scream, I just rename it back.
If by "not used" you mean your application has no remaining references to the tables in question and you are using dynamic SQL, you could do a search for the table names in your app; if they don't appear, blow the tables away.
I've also output all sprocs, functions, etc. to a text file and searched for the table names. If a table isn't found, or is only found in procedures that will need to be deleted too, blow it away.
It looks like using the Profiler is going to work. Once I've let it run for a while, I should have a good list of used tables. Anyone who doesn't use their tables every day can probably wait for them to be restored from backup. Thanks, folks.
Probably too late to help mogrify, but for anybody doing a search: I would search for all objects using this object in my code, and then in SQL Server by running this:
select distinct '[' + object_name(id) + ']'
from syscomments
where text like '%MY_TABLE_NAME%'