How to create each page as PDF from existing PDF using mPDF in Laravel - pdf

use ZanySoft\LaravelPDF\PDF;
$mpdf = new PDF();
$mpdf->SetImportUse();
$pagecount = $mpdf->SetSourceFile(storage_path() . '/app/public/applications/test.pdf');
// test.pdf have 2 pages $pagecount have value 2.
for($i=1; $i <= $pagecount ; $i++) {
$tplId = $mpdf->ImportPage($i);
$mpdf->addPage();
$mpdf->UseTemplate($tplId);
// store each page as pdf file
$new_file_name = "new-".$i.".pdf";
$mpdf->Output($new_file_name, \Mpdf\Output\Destination::FILE);
}
I am getting an error Class 'Mpdf\Output\Destination' not found. I have tried below things as second argument of Output();
$mpdf->Output($new_file_name, \PDF\Output\Destination::FILE);
$mpdf->Output($new_file_name, \Mpdf\Mpdf\Output\Destination::FILE);
But, it is not working. Please help to store fetch page store as PDF file direct on directory.

ZanySoft\LaravelPDF\PDF uses mPDF in version 6.1 which does not have output destination class constants yet.
Use a plain string:
$mpdf->Output($new_file_name, 'F');
Or use composer package mpdf/mpdf in version >=7 directly.

Related

Perl : Scrape website and how to download PDF files from the website using Perl Selenium:Chrome

So I'm studying Scraping website using Selenium:Chrome on Perl, I just wondering how can I download all pdf files from year 2017 to 2021 and store it into a folder from this website https://www.fda.gov/drugs/warning-letters-and-notice-violation-letters-pharmaceutical-companies/untitled-letters-2021 . So far this is what I've done
use strict;
use warnings;
use Time::Piece;
use POSIX qw(strftime);
use Selenium::Chrome;
use File::Slurp;
use File::Copy qw(copy);
use File::Path;
use File::Path qw(make_path remove_tree);
use LWP::Simple;
my $collection_name = "mre_zen_test3";
make_path("$collection_name");
#DECLARE SELENIUM DRIVER
my $driver = Selenium::Chrome->new;
#NAVIGATE TO SITE
print "trying to get toc_url\n";
$driver->navigate('https://www.fda.gov/drugs/warning-letters-and-notice-violation-letters-pharmaceutical-companies/untitled-letters-2021');
sleep(8);
#GET PAGE SOURCE
my $toc_content = $driver->get_page_source();
$toc_content =~ s/[^\x00-\x7f]//g;
write_file("toc.html", $toc_content);
print "writing toc.html\n";
sleep(5);
$toc_content = read_file("toc.html");
This script only download the entire content of the website. Hope someone here can help me and teach me. Thank you very much.
Here is some working code, to help you get going hopefully
use warnings;
use strict;
use feature 'say';
use Path::Tiny; # only convenience
use Selenium::Chrome;
my $base_url = q(https://www.fda.gov/drugs/)
. q(warning-letters-and-notice-violation-letters-pharmaceutical-companies/);
my $show = 1; # to see navigation. set to false for headless operation
# A little demo of how to set some browser options
my %chrome_capab = do {
my #cfg = ($show)
? ('window-position=960,10', 'window-size=950,1180')
: 'headless';
'extra_capabilities' => { 'goog:chromeOptions' => { args => [ #cfg ] } }
};
my $drv = Selenium::Chrome->new( %chrome_capab );
my #years = 2017..2021;
foreach my $year (#years) {
my $url = $base_url . "untitled-letters-$year";
$drv->get($url);
say "\nPage title: ", $drv->get_title;
sleep 1 if $show;
my $elem = $drv->find_element(
q{//li[contains(text(), 'PDF')]/a[contains(text(), 'Untitled Letter')]}
);
sleep 1 if $show;
# Downloading the file is surprisingly not simple with Selenium (see text)
# But as we found the link we can get its url and then use Selenium-provided
# user-agent (it's LWP::UserAgent)
my $href = $elem->get_attribute('href');
say "pdf's url: $href";
my $response = $drv->ua->get($href);
die $response->status_line if not $response->is_success;
say "Downloading 'Content-Type': ", $response->header('Content-Type');
my $filename = "download_$year.pdf";
say "Save as $filename";
path($filename)->spew( $response->decoded_content );
}
This takes shortcuts, switches approaches, and sidesteps some issues (which one need resolve for a fuller utility of this useful tool). It downloads one pdf from each page; to download all we need to change the XPath expression used to locate them
my #hrefs =
map { $_->get_attribute('href') }
$drv->find_elements(
# There's no ends-with(...) in XPath 1.0 (nor matches() with regex)
q{//li[contains(text(), '(PDF)')]}
. q{/a[starts-with(#href, '/media/') and contains(#href, '/download')]}
);
Now loop over the links, forming filenames more carefully, and download each like in the program above. I can fill the gaps further if there's need for that.
The code puts the pdf files on disk, in its working directory. Please review that before running this so to make sure that nothing gets overwritten!
See Selenium::Remove::Driver for starters.
Note: there is no need for Selenium for this particular task; it's all straight-up HTTP requests, no JavaScript. So LWP::UserAgent or Mojo would do it just fine. But I take it that you want to learn how to use Selenium, since it often is needed and is useful.

laravel 7 pdf-to-text | Could not read pdf file

I installed this laravel package to convert pdf file into text. https://github.com/spatie/pdf-to-text
I'm getting this error:
Could not read sample_1610656868.pdf
I tried passing the file statically by putting it in public folder and by giving path of uploaded file.
Here's my controller:
public function pdftotext(Request $request)
{
if($request->hasFile('file'))
{
// Get the file with extension
$filenameWithExt = $request->file('file')->getClientOriginalName();
//Get the file name
$filename = pathinfo($filenameWithExt, PATHINFO_FILENAME);
//Get the ext
$extension = $request->file('file')->getClientOriginalExtension();
//File name to store
$fileNameToStore = $filename.'_'.time().'.'.$extension;
//Upload File
$path = $request->file('file')->storeAS('public/ebutifier/pdf', $fileNameToStore);
}
// dd($path);
$location = public_path($path);
// $pdf = $request->input('file');
// dd($location);
// echo Pdf::getText($location, 'usr/bin/pdftotext');
$text = (new Pdf('public/ebutifier/pdf/'))
->setPdf($fileNameToStore)
->text();
return $text;
}
Not sure why it's not working any help would be appreciated.
You used spatie/pdf-to-text Package.
I also tried to use this package but it gave me this kind of error.
So, I used asika/pdf2text package and it worked properly
Asika/pdf2text pacakge
And this is my code
$path = public_path('/uploads/documents/'. $file_name);
$reader = new \Asika\Pdf2text;
$output = $reader->decode($path);
dd($output);
Hopefully, it will be helpful for you.

Property Photo Files with PHRETS v2

My php code, below, attemps to download all the photos for a property listing. It successfully queries the RETS server, and creates a file for each photo, but the file does not seem to be a functional image. (MATRIX requires files to be downloaded, instead of URLs.)
The list of photos below suggests that it successfully queries one listing id (47030752) for all photos that exist, (20 photos in this case). In a web browser, the files appear only as a small white square on a black background: e.g. (https://photos.atlantarealestate-homes.com/photos/PHOTO-47030752-9.jpg). The file size (4) also seems to be very low, as compared to that of a real photo.
du -s PHOTO*
4 PHOTO-47030752-10.jpg
4 PHOTO-47030752-11.jpg
4 PHOTO-47030752-12.jpg
4 PHOTO-47030752-13.jpg
4 PHOTO-47030752-14.jpg
4 PHOTO-47030752-15.jpg
4 PHOTO-47030752-16.jpg
4 PHOTO-47030752-17.jpg
4 PHOTO-47030752-18.jpg
4 PHOTO-47030752-19.jpg
4 PHOTO-47030752-1.jpg
4 PHOTO-47030752-20.jpg
4 PHOTO-47030752-2.jpg
4 PHOTO-47030752-3.jpg
4 PHOTO-47030752-4.jpg
4 PHOTO-47030752-5.jpg
4 PHOTO-47030752-6.jpg
4 PHOTO-47030752-7.jpg
4 PHOTO-47030752-8.jpg
4 PHOTO-47030752-9.jpg
script I'm using:
#!/usr/bin/php
<?php
date_default_timezone_set('this/area');
require_once("composer/vendor/autoload.php");
$config = new \PHRETS\Configuration;
$config->setLoginUrl('https://myurl/login.ashx')
->setUsername('myser')
->setPassword('mypass')
->setRetsVersion('1.7.2');
$rets = new \PHRETS\Session($config);
$connect = $rets->Login();
$system = $rets->GetSystemMetadata();
$resources = $system->getResources();
$classes = $resources->first()->getClasses();
$classes = $rets->GetClassesMetadata('Property');
$host="localhost";
$user="db_user";
$password="db_pass";
$dbname="db_name";
$tablename="db_table";
$link=mysqli_connect ($host, $user, $password, $dbname);
$query="select mlsno, matrix_unique_id, photomodificationtimestamp from fmls_homes left join fmls_images on (matrix_unique_id=mls_no and photonum='1') where photomodificationtimestamp <> last_update or last_update is null limit 1";
print ("$query\n");
$result= mysqli_query ($link, $query);
$num_rows = mysqli_num_rows($result);
print "Fetching Images for $num_rows Homes\n";
while ($Row= mysqli_fetch_array ($result)) {
$matrix_unique_id="$Row[matrix_unique_id]";
$objects = $rets->GetObject('Property', 'LargePhoto', $matrix_unique_id);
foreach ($objects as $object) {
// does this represent some kind of error
$object->isError();
$object->getError(); // returns a \PHRETS\Models\RETSError
// get the record ID associated with this object
$object->getContentId();
// get the sequence number of this object relative to the others with the same ContentId
$object->getObjectId();
// get the object's Content-Type value
$object->getContentType();
// get the description of the object
$object->getContentDescription();
// get the sub-description of the object
$object->getContentSubDescription();
// get the object's binary data
$object->getContent();
// get the size of the object's data
$object->getSize();
// does this object represent the primary object in the set
$object->isPreferred();
// when requesting URLs, access the URL given back
$object->getLocation();
// use the given URL and make it look like the RETS server gave the object directly
$object->setContent(file_get_contents($object->getLocation()));
$listing = $object->getContentId();
$number = $object->getObjectId();
$url = $object->getLocation();
//$photo = $object->getContent();
$size = $object->getSize();
$desc = $object->getContentDescription();
if ($number >= '1') {
file_put_contents("/bigdirs/fmls_pics/PHOTO-{$listing}-{$number}.jpg", "$object->getContent();");
print "$listing - $number - $size $desc\n";
} //end if
} //end foreach
} //end while
mysqli_close ($link);
fclose($f);
php?>
Are there any suggested changes to capture photos into the created files? This command creates the photo files:
file_put_contents("/bigdirs/fmls_pics/PHOTO-{$listing}-{$number}.jpg", "$object->getContent();");
There may be some parts of this script that wouldn't work in live production, but are sufficient for testing. This script seems to successfully query for the information needed from the RETS server. The problem is just simply that the actual files created do not seem to be functional photos.
Thanks in Advance! :)
Your code sample is a mix of the official documentation and a usable implementation. The problem is with this line:
$object->setContent(file_get_contents($object->getLocation()));
You should completely take that out. That's actually overriding the image you downloaded with nothing before you get a chance to save the contents to a file. With that removed, it should work as expected.

Magento2 read uploaded csv file

I have upload CSV file but problem is i don't know how to read data of uploaded csv file in Magento2.
Please help me anyone.
Thanks in advance.
You can do that like you could in Magento 1. In Magento 1 you would use
new Varien_File_Csvbut in Magento 2 you can do the same with \Magento\Framework\File\Csv. You can use the following code.
In your __construct() inject the following classes:
protected $_fileCsv;
public function __construct(
\Magento\Backend\App\Action\Context $context,
\Magento\Framework\Module\Dir\Reader $moduleReader,
\Magento\Framework\File\Csv $fileCsv
) {
$this->_moduleReader = $moduleReader;
$this->_fileCsv = $fileCsv;
parent::__construct($context); // If your class doesn't have a parent, you don't need to do this, of course.
}
Then you can use it like this:
// This is the directory where you put your CSV file.
$directory = $this->_moduleReader->getModuleDir('etc', 'Vendor_Modulename');
// This is your CSV file.
$file = $directory . '/your_csv_file_name.csv';
if (file_exists($file)) {
$data = $this->_fileCsv->getData($file);
// This skips the first line of your csv file, since it will probably be a heading. Set $i = 0 to not skip the first line.
for($i=1; $i<count($data); $i++) {
var_dump($data[$i]); // $data[$i] is an array with your csv columns as values.
}
}

CMS and store hi-resolution images in generated pdf

I'm looking for good CMS for publishing software manuals.
Requirements:
publish manual pages as web pages with thumbnails and shows full resolution after click on image,
exporting manual pages to a pdf file with full resolution images instead to thumbnails.
I found IMHO best wiki system named Tiki Wiki (https://info.tiki.org/) but when I export to pdf then I gets low resolution thumbnail.
I solve this problem by very simple Tiki Wiki code modification:
Modify lib/wiki-plugins/wikiplugin_img.php to force using full image resolution instead to thumbnail in print page mode (inserted code 1) and rescale images in generated HTML by 0.5 factor (inserted code 2):
[...]
function wikiplugin_img( $data, $params )
{
[...]
$imgdata = array_merge($imgdata, $params);
// inserted code 1 (~410 line)
if ($GLOBALS['section_class']=="tiki_wiki_page print"){
$imgdata['thumb'] = '';
}
// end of inserted code 1
//function calls
if ( !empty($imgdata['default']) || !empty($imgdata['mandatory'])) {
[...]
$fwidth = '';
$fheight = '';
if (isset(TikiLib::lib('parser')->option['indexing']) && TikiLib::lib('parser')->option['indexing']) {
$fwidth = 1;
$fheight = 1;
} else {
// inserted code 2 (~410 line)
if ($GLOBALS['section_class']=="tiki_wiki_page print"){
$fwidth = $imageObj->get_width() / 2;
$fheight = $imageObj->get_height() / 2;
} else {
$fwidth = $imageObj->get_width();
$fheight = $imageObj->get_height();
}
// end of inserted code 2 (~638 line)
}
[...]
Now, after printing to pdf by wkhtmltopdf we gets pdf with small but full resolution images.
Additional modifies:
Adds following lines to cms/cssmenus.css (or other css included in print mode) for increase bottom margin of image caption:
div.thumbcaption {
margin-bottom: 5mm;
}
Removes lines from 171 to ~175 in templates/tiki-show_content.tpl for remove the "The original document is available at" foot.