Is there a way to create a filterable table in Docusaurus?

I'm thinking of using Docusaurus. One use case would be a fairly long table (100 rows).
To improve usability, I want the user to be able to filter the table by some columns. Is that possible?

Related

Populating a PostgreSQL table with sample data using summary stats

My customer has a table with ~150 columns in their DB. I don't have access to the DB; I only have summary stats about each column in the table: the distinct values in each column along with their likelihood of occurrence.
I'm trying to create a representative copy of this table on my own DB so that I can run queries against it for testing purposes. The only way I know to do this is to write a huge SELECT statement that uses the random() function to randomly choose between the possible values of each column (and other methods for timestamps and IDs). This SELECT is then used inside an INSERT INTO.
This approach just isn't scalable though. I want to be able to do this for a lot more tables. Is there an easier way to do this? I'd like to avoid paid tools if possible.
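For reference, the hand-written approach described above usually looks something like this rough sketch (the table, columns, and value distributions are invented for illustration; they are not the customer's actual schema):

-- Sketch only: invented table, columns, and value distributions
CREATE TABLE customers_copy (
    status     text,
    country    text,
    created_at timestamptz
);

INSERT INTO customers_copy (status, country, created_at)
SELECT
    -- uniform pick among the known distinct values
    (ARRAY['active', 'inactive', 'pending'])[floor(random() * 3)::int + 1],
    -- weighted pick according to the stated likelihoods (80% / 20%)
    CASE WHEN random() < 0.8 THEN 'US' ELSE 'CA' END,
    -- random timestamp within the last year
    now() - random() * interval '365 days'
FROM generate_series(1, 100000);

Writing and maintaining one of these per table is exactly the part that doesn't scale, which is the point of the question.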

In PostgreSQL, efficiently using a table for every row in another table

I'm sorry for the lack of proper terminology in my question, but I'm not too familiar with SQL. Despite searching the internet for a good number of hours, I couldn't find how to do what I want efficiently, though that may be because I'm not familiar with the terminology. Here is the question:
I want to create a table, say Forms, in which each Form row has an ID, some metadata, and a pointer(?) to a table belonging to that Form row, let's say the Form12 table. I need this because every Form has a different number, name, and type of columns depending on the user's configuration for that particular Form.
So I thought I could put the table ID of Form12 as a column in the Forms table. Is this approach considered OK, or is there a better way to do it?
Thank you for your time.
Storing the names of tables in a column is generally not a good solution in a relational database. In order to use the information, you need to use dynamic SQL.
I would instead ask why you cannot store the information in a single table or well-defined sets of tables. Postgres has lots of options to help with this:
NULL data values, so columns do not need to be filled in.
Table inheritance, so tables can share columns.
JSON columns to support a flexible set of columns (see the sketch after this list).
Entity-attribute-value (EAV) data models, which allow for lots of flexibility.
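To make the JSON option concrete, here is a minimal sketch; the field names and values are invented, since the question doesn't specify them, and jsonb needs Postgres 9.4+ (plain json works on older versions):

-- Sketch only: field names and values are invented
CREATE TABLE forms (
    form_id serial PRIMARY KEY,
    name    text NOT NULL,
    fields  jsonb NOT NULL DEFAULT '{}'::jsonb  -- per-form, user-configured "columns"
);

INSERT INTO forms (name, fields)
VALUES ('Form12', '{"age": 42, "favorite_color": "blue"}');

-- Query one of the flexible fields
SELECT form_id, fields->>'favorite_color' AS favorite_color
FROM forms
WHERE fields ? 'favorite_color';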

Improve performance on a large SQL table

I have a 260-column table in SQL Server. When we run "SELECT COUNT(*) FROM table" it takes almost 5-6 to get the count. The table contains close to 90-100 million records with 260 columns, and more than 50% of the columns contain NULLs. Apart from that, users can also build dynamic SQL queries against the table from the UI, so searching 90-100 million records takes time to return results. Is there a way to improve the find functionality on a SQL table where the filter criteria can be anything? Can anyone suggest the fastest way to get aggregate data on 25 GB of data, without the UI hanging or timing out?
Investigate horizontal partitioning. This will really only help query performance if you can force users to put the partitioning key into the predicates.
Try vertical partitioning, where you split one 260-column table into several tables with fewer columns. Put all the values which are commonly required together into one table. The queries will then only reference the table(s) which contain the required columns. This will give you more rows per page, i.e. fewer pages per query.
You have a high fraction of NULLs. Sparse columns may help, but calculate your percentages as they can hurt if inappropriate. There's an SO question on this.
Filtered indexes and filtered statistics may be useful if the DB often runs similar queries.
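As a rough illustration of the filtered-index idea, assuming an invented table and columns (choose a predicate that matches the queries your users actually run):

-- Sketch only: table, columns, and predicate are invented
CREATE NONCLUSTERED INDEX IX_BigTable_Orders_WithDate
ON dbo.BigTable (CustomerId, OrderDate)
WHERE OrderDate IS NOT NULL;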
As others state in the comments, you need to analyse a few of the queries and see which indexes would help you the most. If your queries do a lot of searching, you could use the full-text search feature of SQL Server. Here you will find a nice reference with good examples.
Things that come to mind:
[SQL Server 2012+] If you are using SQL Server 2012, you can use the new Columnstore Indexes.
[SQL Server 2005+] If you are filtering a text column, you can use Full-Text Search
If you have some function that you apply frequently to some column (SOUNDEX of a column, for example), you could create a PERSISTED computed column so you don't have to compute the value every time (see the sketch after this list).
Use temp tables (indexed ones will be much better) to reduce the number of rows to work on.
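A rough sketch of the persisted computed column idea from the SOUNDEX example above (table and column names are invented):

-- Sketch only: invented table and column names
ALTER TABLE dbo.BigTable
ADD LastNameSoundex AS SOUNDEX(LastName) PERSISTED;

-- The persisted value can be indexed and searched without
-- recomputing SOUNDEX for every row at query time
CREATE INDEX IX_BigTable_LastNameSoundex
ON dbo.BigTable (LastNameSoundex);

SELECT * FROM dbo.BigTable WHERE LastNameSoundex = SOUNDEX('Smith');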
@Twelfth's comment is very good:
"I think you need to create an ETL process and start changing this into a fact table with dimensions."
Changing my comment into an answer...
You are moving from a transactional world, where these 90-100 million records are recorded, into a data warehousing scenario where you are now trying to slice, dice, and analyze the information you have. There is no easy solution, but odds are you're hitting the limits of what your current system can scale to.
In a past job, I had several (6) data fields belonging to each record that were pretty much free text, randomly populated depending on where the data was generated (they were search queries, and people were entering what they would basically enter in Google). With 6 fields like this, I created a dim_text table that took each entry in any of these 6 fields and replaced it with an integer. This left me with a table with two columns, text_ID and text. Any time a user searched for a specific entry in any of these 6 columns, I would query my dim_text table, which was optimized (indexed) for this sort of lookup, to return the integer matching the query I wanted. I would then take that integer and search for all occurrences of it across the 6 fields instead. Searching one table highly optimized for this type of free-text search and then querying the main table for instances of the integer is far quicker than searching 6 free-text fields directly.
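A rough sketch of that idea, with invented names (the original schema isn't shown in the answer):

-- Sketch only: all names are invented
CREATE TABLE dim_text (
    text_id INT IDENTITY(1, 1) PRIMARY KEY,
    [text]  NVARCHAR(400) NOT NULL UNIQUE
);

-- The big table stores integer keys instead of the six free-text values
CREATE TABLE fact_searches (
    row_id    BIGINT IDENTITY(1, 1) PRIMARY KEY,
    text_id_1 INT NULL, text_id_2 INT NULL, text_id_3 INT NULL,
    text_id_4 INT NULL, text_id_5 INT NULL, text_id_6 INT NULL
    -- ... plus whatever other columns the record needs
);

-- Resolve the search term once against the small, well-indexed dimension
-- table, then scan the integer columns of the big table
DECLARE @searchTerm NVARCHAR(400) = N'example query';
DECLARE @id INT = (SELECT text_id FROM dim_text WHERE [text] = @searchTerm);

SELECT *
FROM fact_searches
WHERE @id IN (text_id_1, text_id_2, text_id_3, text_id_4, text_id_5, text_id_6);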
I'd also create aggregate tables (reporting tables, if you prefer the term) for your common aggregates. There are quite a few options here that your business setup will determine. For example, if each row is an item on a sales invoice and you need to show sales by date, it may be better to aggregate total sales by invoice and save that to a table; then, when a user wants totals by day, an aggregate is run on the aggregate of the invoices to determine the totals by day (so you've 'partially' aggregated the data in advance).
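A rough sketch of such a 'partially aggregated' reporting table (names are invented; sales_invoice_lines stands in for the assumed source table of invoice line items):

-- Sketch only: names are invented
CREATE TABLE agg_sales_by_invoice (
    invoice_id   INT PRIMARY KEY,
    invoice_date DATE NOT NULL,
    total_sales  DECIMAL(18, 2) NOT NULL
);

-- Populated by a scheduled job rather than at query time
INSERT INTO agg_sales_by_invoice (invoice_id, invoice_date, total_sales)
SELECT invoice_id, invoice_date, SUM(line_amount)
FROM sales_invoice_lines
GROUP BY invoice_id, invoice_date;

-- Daily totals now come from the much smaller aggregate table
SELECT invoice_date, SUM(total_sales) AS sales_for_day
FROM agg_sales_by_invoice
GROUP BY invoice_date;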
Hope that makes sense...I'm sure I'll need several edits here for clarity in my answer.

How can I add a column to a postgres table in front of the others?

I have a table with lots of columns, and I'd like to add two more (date and time) to the front of the existing table.
There is no data in the table right now, but I'm wondering what the best way is to get the table into the format I need.
I could just drop the table and create a new one with the correct configuration, but I'm wondering if there is a better way?
This is currently not possible. You have to drop and recreate the table.
Theoretically you could add the column, drop and re-add all other columns, but that's hardly practical.
It's an ongoing discussion and an open TODO-item of the Postgres project to allow reordering of columns. But a lot of dependencies and related considerations make that hard.
Quoting the Postgres project's ToDo List:
Allow column display reordering by recording a display, storage, and permanent id for every column?
Contrary to what some believe, the order of columns in a table is not irrelevant, for multiple reasons.
The default order is used for statements like INSERT without column definition lists.
Or SELECT *, which returns columns in the predefined order.
The composite type of the table uses the same order of columns.
The order of columns is relevant for storage optimization (padding and alignment matter). More:
Calculating and saving space in PostgreSQL
People may be confusing this with the order of rows, which is undefined in a table.
In relational databases, the order of columns in a table is irrelevant.
Create a view that shows you the columns in the order you want (see the sketch below).
If you still want to, drop the table and recreate it.
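A minimal sketch of the view option, with invented table and column names:

-- Sketch only: names are invented
CREATE TABLE measurements (
    sensor_id     integer,
    reading       numeric,
    measured_date date,  -- the newly added columns
    measured_time time
);

-- The view presents the new columns first without recreating the table
CREATE VIEW measurements_ordered AS
SELECT measured_date, measured_time, sensor_id, reading
FROM measurements;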

PostgreSQL - What's the absolute fastest way to exclude a certain set of rows in all table searches

I run a recipe website that uses PostgreSQL 9.1 as a backend. When a user searches for recipes, I build a query on the fly depending on what the user wants to find. For example, if the user wants to find all recipes that have a cook time under 30 minutes, I would generate the query:
SELECT * From Recipes WHERE CookTime < 30;
I now have the need to "hide" certain recipes, meaning they will never show up in any search, ever. The only way to get to them would be knowing the URL directly. To do this, I've added a new column to the Recipes table:
ALTER TABLE Recipes ADD COLUMN Hidden boolean not null default false;
CREATE INDEX IDX_Recipes_Hidden ON Recipes(Hidden);
My idea is to just hard code the phrase "NOT HIDDEN" into every WHERE clause. For example, the query above would now be:
select * from recipes where not Hidden and CookTime < 30;
My Question:
According to the query analyzer, this will now generate a bitmap to combine the two indexes. Keep in mind 99% of the recipes will not be hidden. I want to know if this technique is the best, and fastest way to exclude certain recipes from all queries. I know the absolute fastest way would be to create a separate table for hidden recipes, however this would be a massive amount of re-factoring so I'd like to avoid this.
Do you have any performance issues? If there are no issues with your solution it makes no sense to waste more time on something that needs no change.
A bitmap index scan is fine for something where you don't have many different values. So in your case, where you only have true and false, it is fine.
You could also build something like a materialized view, but this seems to be too much work, and it would probably be easier for you to just create a second table. But if you do not have any issues, don't change anything.
MVs in postgres: http://tech.jonathangardner.net/wiki/PostgreSQL/Materialized_Views
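Note that PostgreSQL 9.1 has no built-in materialized views (they arrived in 9.3), so the linked approach amounts to maintaining a table yourself. A rough sketch, assuming the Recipes table from the question:

-- Sketch only: a hand-maintained "materialized view" of visible recipes
CREATE TABLE recipes_visible_mv AS
SELECT * FROM Recipes WHERE NOT Hidden;

-- Refresh periodically (e.g. from cron) or from triggers on Recipes
TRUNCATE recipes_visible_mv;
INSERT INTO recipes_visible_mv
SELECT * FROM Recipes WHERE NOT Hidden;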
The fastest way to stop rows from showing up ever again is... delete them.
But if you want them around for some purpose, but don't want them in almost all queries, you could rename the table and create a new view in its place.
ALTER TABLE Recipes RENAME TO AllRecipes;
ALTER TABLE AllRecipes ADD Hidden BOOLEAN NOT NULL DEFAULT FALSE;
CREATE VIEW Recipes AS SELECT * FROM AllRecipes WHERE NOT Hidden;
This is fastest in terms of how much code you will need to rewrite (assuming your app has many queries on Recipes, and you want all of them to exclude the hidden ones).
But it also gives you easy options to improve performance. For a start you can add an index on Hidden. But you can also partition it into two subtables, VisibleRecipes and HiddenRecipes. The view Recipes will show exactly the ones in VisibleRecipes.
The table AllRecipes could be either a parent table with VisibleRecipes and HiddenRecipes as its partitions, or it could be a view itself.
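For reference, on PostgreSQL 9.1 the parent-table variant would use table inheritance, roughly like this (columns other than Hidden and CookTime are invented, and the column list is abbreviated):

-- Sketch only: column list is abbreviated and partly invented
CREATE TABLE AllRecipes (
    RecipeId integer,
    Title    text,
    CookTime integer,
    Hidden   boolean NOT NULL DEFAULT FALSE
);

CREATE TABLE VisibleRecipes (CHECK (NOT Hidden)) INHERITS (AllRecipes);
CREATE TABLE HiddenRecipes  (CHECK (Hidden))     INHERITS (AllRecipes);

-- Queries against the view only ever touch the visible subtable
CREATE VIEW Recipes AS SELECT * FROM VisibleRecipes;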
If you don't have performance issues, it's OK.
If I were the engine, I would use the index to get the table rows with CookTime less than 30, and after that I would filter out those with Hidden = true.
If you know how to enforce this (use of the CookTime index only), it's fine to test it.
But if your analyser finds the usage of two indexes faster...
Be sure you have statistics on the tables and indexes collected.
(I have expertise on Oracle, not Postgres)