Apache FOP - Complex Table

Apache FOP - Complex Table - pdf

I have a complex table (multiple header rows, and detail rows that each point to a specific header row) that I need to render in a PDF and have it be ADA compliant. In Adobe I would make it compliant by opening the table editor. For each header cell, I would go into the table cell properties and give it a unique ID. For each detail row cell, I would go to the cell properties and set "Associated Header Cell IDs".
Is there any way to access these properties through FOP?

Related

Create table schema and load data in bigquery table using source google drive

I am creating a table using Google Drive as the source and Google Sheets as the format.
I selected "Drive" as the value for "Create table from". For the file format, I selected Google Sheet.
I also selected schema auto-detection and input parameters.
It creates the table, but the first row of the sheet is loaded as data instead of being used for the table's field names.
What do I need to do so that the first row of the sheet becomes the table's column names rather than a data row?

It would have been helpful if you could include a screenshot of the top few rows of the file you're trying to upload at least to see the data types you have in there. BigQuery, at least as of when this response was composed, cannot differentiate between column names and data rows if both have similar datatypes while schema auto detection is used. For instance, if your data looks like this:
headerA, headerB
row1a, row1b
row2a, row2b
row3a, row3b
BigQuery would not be able to detect the column names (at least automatically using the UI options alone) since all the headers and row data are Strings. The "Header rows to skip" option would not help with this.
Schema auto detection should be able to detect and differentiate column names from data rows when you have different data types for different columns though.
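
To make the reasoning above concrete, here is a toy sketch in Python. It is not BigQuery's detection logic; the type rules and the function names `infer_type` and `header_is_detectable` are assumptions made up for illustration. It only shows why a first row of strings sitting above string data gives an auto-detector nothing to tell the header apart by.

```python
def infer_type(value):
    """Classify a cell as 'int', 'float', or 'string' (toy rules, not BigQuery's)."""
    for caster, name in ((int, "int"), (float, "float")):
        try:
            caster(value)
            return name
        except ValueError:
            continue
    return "string"


def header_is_detectable(rows):
    """Return True if some column's first cell has a type that never
    appears in the data cells below it (hypothetical heuristic)."""
    first, data = rows[0], rows[1:]
    if not data:
        return False  # nothing to compare the first row against
    for col, head in enumerate(first):
        data_types = {infer_type(row[col]) for row in data}
        if infer_type(head) not in data_types:
            return True
    return False


all_strings = [["headerA", "headerB"], ["row1a", "row1b"], ["row2a", "row2b"]]
mixed_types = [["name", "age"], ["Ann", "30"], ["Bob", "41"]]

print(header_is_detectable(all_strings))  # False
print(header_is_detectable(mixed_types))  # True
```

In the all-strings case every column has the same type in every row, which matches the situation described above.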

You have the option to skip header rows under Advanced options. Enter 1 as the number of header rows to skip (your header is in the first row). The first row will then be skipped as data and used as the values for your header.

How do I format/tag an accessible PDF table that spans multiple pages horizontally?

I'm responsible for remediating a PDF that was generated by a third-party, proprietary system; I have no access to its layout or design. The goal is to pass the Adobe Acrobat DC accessibility checker before publication.
Some of the tables in the PDF span multiple pages horizontally (i.e. with a page break at column 4 of 7). Thus far, I have designated each piece of text content as a "Cell" and grouped those into a "Table Row" tag and defined each header and sub-header as a "Table Header Cell".
However, Acrobat DC seems to get confused as to the relative size and spacing of each table element. It is creating phantom column headers and rearranging or combining rows in order to fit the appearance of a more standard layout PER PAGE. But since I need one cohesive table to span TWO PAGES, this is breaking my accessibility.
Depending on how I nest my table elements, I get a table layout like one of the two examples below:
Example when including blank cells for multi-column header rows
Example when defining the column span of multi-column header rows as "7"
As you can see, the layout is not uniform and does not pass regularity checks. Plus, as I add more rows with several blank cells, the table editor produces an error that reads:
"Unknown Table Structure Encountered"
The only way I have managed to remove this error is to exclude the bolded main-section sub-headers from the tag structure entirely, but I cannot leave them as untagged content and still pass the checker.
Please help.

Signed up just to comment to
Kevin, thanks for replying. Because of the malformed grid, I cannot even click on the cells on Page 2 in order to associate headers. Is there a way to define table structure without using the Table Editor mode? – Glamador Apr 3 at 12:27
but don't have the rep yet to do so:
Glamador - I know this can't help you half a year ago, but it might in the future: I encountered this in a document this week and figured out the "why" and how to get the Table Editor back, but not the easiest or best way to solve the tagging in Acrobat. What is denying you the Table Editor is the table header (TH) cell you created that spans multiple pages.
If you set a table header cell to something like Row Span: 7, and 3 of those rows are on the second page, Acrobat will give you the "Unknown table structure encountered. Please retag this table using the Reading Order Tool to possibly fix the problem." error any time you try to use the Table Editor on the table that has that table header cell with a multi-page row span. (I'm not working with column spans, but I assume the same applies to them.)
To get your Table Editor back (this does not solve the accessibility tagging; it only stops that error on your table):
Go to your tags
Create a new empty Table Header Cell
Drag the content displayed in the tag from the problem TH to your new TH
Delete the [multiple page row/column spanning, but now empty] problem TH
Repeat if you did this in multiple TH in the same table
You can now use Table Editor again
Note: Once these problem headers have been created you can't use the Table Editor, so you can't use it to see which THs you have set to span multiple pages, or to see their row/column spans. If you are going back to check a document you tagged earlier, you will have to look at the document itself and work out which headers are the likely problem ones to replace. If you create that header span again in a table that goes across multiple pages, you'll be unable to use the Table Editor again until you delete the tag with the page-spanning issue.
I haven't found out whether you can combine TH Row Span settings with IDs/Associated Header Cell IDs and have the user's software recognize both, so I've been doing the tedious ID association on large but simple tables as my "it's tagged correctly" option. Unfortunately, it isn't nearly as fast and easy as Row Spans.

You can edit the tag's object properties by right-clicking on the tag and then you can add an ID there if it doesn't already have one. Be sure each data cell is associated with a header cell. PAC's screen reader preview will also give a good view of the layout to help you get everything associated correctly.

How to load an Excel file that has its header in two rows using Pentaho

I have an Excel file whose header spans two rows (the first column's header is in the second row, and the remaining columns' headers are in the first and second rows), as shown in the image.
I have to load this Excel file into a table using Pentaho.
Please let me know how to load it.
Thanks.

Actually, you define a cell block to read from, either implicitly (top left) or explicitly (giving offsets on the Sheets tab). You can tell Spoon that you want the first row of that block treated as a header row containing fieldnames. This allows you to populate the field list (button Get Fields) at design-time - a convenience feature.
If the names don't suit you, just change them.

VB.Net: Read Table from rtf-File

I have some RTF-Files with a table. Is there a way to get the content of the table into a datatable? Or is there a way to convert the table to csv?

I'll post this as a part answer only, as it is not complete, but can be used to solve the issue that you have.
From the document specified in my comment I found this detail...
Table Definitions
There is no RTF table group; instead, tables are specified as paragraph properties. A table is represented as a sequence of table rows. A table row is a contiguous series of paragraphs partitioned into cells. The table row begins with the \trowd control word and ends with the \row control word. Every paragraph that is contained in a table row must have the \intbl control word specified or inherited from the previous paragraph. A cell may have more than one paragraph in it; the cell is terminated by a cell mark (the \cell control word), and the row is terminated by a row mark (the \row control word). Table rows can also be positioned. In this case, every paragraph in a table row must have the same positioning controls (see the controls on the Positioned Objects and Frames subsection of this Specification). Table properties may be inherited from the previous row; therefore, a series of table rows may be introduced by a single .
This detail starts on page 93 and seems to provide the bulk of what you need to know.
From this point you should read the file into a string and then search it for each subsequent occurrence of \trowd (allowing for the closing \row command). This should allow the traversal of all tables within the RTF document. Using this method, and by analysing data within the table, you should be able to ascertain what is important to your requirements.
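
The search described above can be sketched roughly as follows. The sketch is in Python rather than VB.Net, and the function names `extract_rtf_tables` and `rows_to_csv` are made up for illustration. It assumes every row starts with its own \trowd, and it ignores nested groups, escaped characters, and nested tables, so treat it as a starting point rather than a complete RTF parser.

```python
import csv
import io
import re

# A row runs from \trowd to \row; the lookahead keeps \row from matching
# a longer control word, and \cell from matching \cellx.
ROW_RE = re.compile(r"\\trowd(?![a-z])(.*?)\\row(?![a-z])", re.DOTALL)
CELL_RE = re.compile(r"\\cell(?![a-z])")
# Any control word with an optional numeric parameter and one trailing space.
CONTROL_RE = re.compile(r"\\[a-z]+-?\d* ?")


def extract_rtf_tables(rtf):
    """Return a list of rows, each a list of cell strings (simplified)."""
    rows = []
    for match in ROW_RE.finditer(rtf):
        # Text after the last \cell belongs to the row trailer, so drop it.
        parts = CELL_RE.split(match.group(1))[:-1]
        cells = []
        for part in parts:
            text = CONTROL_RE.sub("", part)
            text = text.replace("{", "").replace("}", "")
            cells.append(text.strip())
        rows.append(cells)
    return rows


def rows_to_csv(rows):
    """Serialize the extracted rows as CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


sample = (r"{\rtf1\ansi \trowd\cellx1000\cellx2000 \intbl Name\cell Age\cell\row "
          r"\trowd\cellx1000\cellx2000 \intbl Ann\cell 30\cell\row }")
print(rows_to_csv(extract_rtf_tables(sample)))
```

Because the quoted specification says table properties may be inherited from the previous row, a real document may contain rows that this simple \trowd search would miss.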

Multi page Word tables to PowerPoint while preserving headers

I have a macro that moves all pictures and tables to a PowerPoint while capturing the figure name and number as well as the table name and number. I am pasting the tables in as .Shapes.PasteSpecial(ppPasteMetafilePicture).
This has worked great in the past but I have come across about 150 documents that need to be converted that contain tables that span more than one page. When the macro pastes the table it cuts off at the first page.
If I split the table using the macro it does not carry over the headers.
What I want to be able to do is split the table into multiple slides, one per Word document page that it occupies, and include the table's headers on each.

Since you're pasting as a picture the only possibility is to EDIT the Word tables. You'd need to read how many rows comprise the table header, copy those rows, deactivate the table header setting, then paste the row(s) at the top of each page. Then you can copy each page. At the end, close the document without saving so that the original still has the table headers.
