Welcome to jfscripts’s documentation!¶
Contents:
Command line interfaces¶
dns-ipv6-prefix.py¶
Get the ipv6 prefix from a DNS name.
usage: dns-ipv6-prefix.py [-h] [-V] dnsname
Positional Arguments¶
dnsname | The DNS name, e. g. josef-friedrich.de |
Named Arguments¶
-V, --version | show program’s version number and exit |
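As a rough illustration of what "getting the IPv6 prefix" of a DNS name can involve, the sketch below resolves a name to an IPv6 address and masks it to a network prefix. The function names `prefix_of` and `resolve_ipv6` and the default prefix length of 64 are assumptions made for this example; they are not taken from the script's source.

```python
import ipaddress
import socket


def prefix_of(address, prefix_length=64):
    """Return the network prefix of an IPv6 address as a string."""
    # strict=False lets ip_network() zero the host bits for us.
    network = ipaddress.ip_network(
        '{}/{}'.format(address, prefix_length), strict=False)
    return str(network.network_address)


def resolve_ipv6(dnsname):
    """Resolve a DNS name to its first IPv6 address (needs network access)."""
    infos = socket.getaddrinfo(dnsname, None, socket.AF_INET6)
    return infos[0][4][0]
```

With network access, `prefix_of(resolve_ipv6('josef-friedrich.de'))` would combine both steps.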
extract-pdftext.py¶
usage: extract-pdftext.py [-h] [-c] [-v] [-V] file
Positional Arguments¶
file | A PDF file containing text |
Named Arguments¶
-c, --colorize | Colorize the terminal output. Default: False |
-v, --verbose | Make the command line output more verbose. Default: False |
-V, --version | show program’s version number and exit |
find-dupes-by-size.py¶
Find duplicate files by size.
usage: find-dupes-by-size.py [-h] [-V] path
Positional Arguments¶
path | A directory to recursively search for duplicate files. |
Named Arguments¶
-V, --version | show program’s version number and exit |
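One plausible way to find duplicate candidates by size is to walk the directory tree and group the files by their byte size. The sketch below is a minimal illustration under that assumption; the function name `group_by_size` and the returned dictionary layout are hypothetical and may differ from what the script does.

```python
import os
from collections import defaultdict


def group_by_size(path):
    """Group regular files below path by size; keep groups of two or more."""
    by_size = defaultdict(list)
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            # Skip symlinks so that a link and its target are not reported.
            if os.path.isfile(file_path) and not os.path.islink(file_path):
                by_size[os.path.getsize(file_path)].append(file_path)
    return {size: sorted(paths)
            for size, paths in by_size.items() if len(paths) > 1}
```

Files of equal size are only candidates; comparing their contents (for example by hash) would be needed to confirm real duplicates.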
list-files.py¶
This is a script to demonstrate the list_files() function in this file.
list-files.py a.txt
list-files.py a.txt b.txt c.txt
list-files.py (asterisk).txt
list-files.py "(asterisk).txt"
list-files.py dir/
list-files.py "dir/(asterisk).txt"
usage: list-files.py [-h] [-V] input_files [input_files ...]
Positional Arguments¶
input_files | Examples for this argument are: “a.txt”, “a.txt b.txt c.txt”, “(asterisk).txt”, “"(asterisk).txt"”, “dir/”, “"dir/(asterisk).txt"” |
Named Arguments¶
-V, --version | show program’s version number and exit |
mac-to-eui64.py¶
Convert mac addresses to EUI64 ipv6 addresses.
usage: mac-to-eui64.py [-h] [-V] mac prefix
Positional Arguments¶
mac | The mac address. |
prefix | The ipv6 /64 prefix. |
Named Arguments¶
-V, --version | show program’s version number and exit |
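The standard modified EUI-64 construction splits the 48-bit MAC address in the middle, inserts the bytes ff:fe, and flips the universal/local bit of the first octet. The sketch below shows that construction; the function name `mac_to_eui64` and the assumption that the prefix is given in CIDR notation (for example `2001:db8::/64`) are illustrative and not taken from the script's source.

```python
import ipaddress


def mac_to_eui64(mac, prefix):
    """Combine a MAC address and a /64 prefix into an EUI-64 IPv6 address."""
    octets = [int(part, 16) for part in mac.replace('-', ':').split(':')]
    if len(octets) != 6:
        raise ValueError('expected a 48-bit MAC address: ' + mac)
    octets[0] ^= 0x02  # flip the universal/local bit
    eui64 = octets[:3] + [0xFF, 0xFE] + octets[3:]  # insert ff:fe in the middle
    interface_id = int.from_bytes(bytes(eui64), 'big')
    network = ipaddress.ip_network(prefix, strict=False)
    if network.version != 6 or network.prefixlen != 64:
        raise ValueError('expected an IPv6 /64 prefix: ' + prefix)
    address = int(network.network_address) | interface_id
    return str(ipaddress.IPv6Address(address))
```

For example, `mac_to_eui64('00:15:5d:01:02:03', '2001:db8::/64')` yields `2001:db8::215:5dff:fe01:203`.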
pdf-compress.py¶
Convert and compress PDF scans. Make scans suitable for imslp.org (International Music Score Library Project). See also http://imslp.org/wiki/IMSLP:Musiknoten_beisteuern. The output files are monochrome bitmap images at a resolution of 600 dpi and the compression format CCITT group 4.
usage: pdf-compress.py [-h] [-c] [-m] [-N] [-v] [-V]
{convert,con,c,extract,ex,e,join,jn,j,samples,sp,s,unify,un,u}
...
Positional Arguments¶
subcommand | Possible choices: convert, con, c, extract, ex, e, join, jn, j, samples, sp, s, unify, un, u Subcommand |
Named Arguments¶
-c, --colorize | Colorize the terminal output. Default: False |
-m, --multiprocessing | Use multiprocessing to run commands in parallel. Default: False |
-N, --no-cleanup | Don’t clean up the temporary files. Default: False |
-v, --verbose | Make the command line output more verbose. Default: False |
-V, --version | show program’s version number and exit |
Sub-commands:¶
convert (con, c)¶
Convert scanned images (in many image file formats, or PDF files) into monochrome bitmap images. The resulting images are compressed using CCITT Group 4 compression.
pdf-compress.py convert [-h] [-a | -C | -P] [-b] [--blur BLUR] [-B] [-c] [-d]
[-e] [-f] [-j] [-o]
[-l OCR_LANGUAGE [OCR_LANGUAGE ...]] [-p] [-n]
[-q QUALITY] [-r] [-t THRESHOLD] [-T] [-u]
input_files [input_files ...]
Positional Arguments¶
input_files |
|
Named Arguments¶
-a, --auto-black-white | The same as “--deskew --join --ocr --pdf --resize --trim --unify”. Default: False |
-C, --auto-color | The same as “--color --deskew --join --ocr --pdf --resize --trim --unify”. Default: False |
-P, --auto-png | The same as “--deskew --resize --trim”. Default: False |
-b, --backup | Backup original images (add _backup.ext to filename). Default: False |
--blur | Blur images for better jpeg2000 compression rate. Default: False |
-B, --border | Frame the images with a white border. Default: False |
-c, --color | The input files are colored images. Default: False |
-d, --deskew | Straighten the images. Default: False |
-e, --enlighten-border | Enlighten the border. Default: False |
-f, --force | Overwrite the output file even if it exists and it seems to be already converted. Default: False |
-j, --join | Join single-page PDF files into one PDF file. This option only takes effect together with the option --pdf. Default: False |
-o, --ocr | Perform optical character recognition (OCR) on the input files. The output format must be PDF. Default: False |
-l, --ocr-language | Run tesseract --list-langs to get your installed languages. |
-p, --pdf | Generate a PDF file. Default: False |
-n, --png | Generate a PNG file. Default: False |
-q, --quality | Compress the input images with a specific quality. The command automatically switches to color mode. Default: False |
-r, --resize | Resize to 200 percent. Default: False |
-t, --threshold | Threshold for monochrome (black and white) images. Colors above the threshold become white, colors below it become black. Default: 50% |
-T, --trim | This option removes any edges that are exactly the same color as the corner pixels. Default: False |
-u, --unify | Unify the page size of all pages in a PDF File. The output must be a joined PDF. Default: False |
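The effect of the --threshold option described in the table above can be pictured on plain grayscale values. The sketch below is only an illustration of the rule "above the threshold becomes white, below becomes black"; the actual conversion is done by ImageMagick, and the function name `apply_threshold` as well as the treatment of values exactly at the cutoff are assumptions.

```python
def apply_threshold(gray_pixels, threshold_percent=50, max_value=255):
    """Map grayscale values to pure black (0) or pure white (max_value)."""
    cutoff = max_value * threshold_percent / 100
    # Values above the cutoff become white, everything else becomes black.
    return [max_value if value > cutoff else 0 for value in gray_pixels]
```

A higher threshold turns more gray pixels black, which tends to thicken thin strokes in a scan; a lower threshold does the opposite.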
extract (ex, e)¶
Extract images from a PDF file and export them in the TIFF format.
pdf-compress.py extract [-h] input_file [input_file ...]
Positional Arguments¶
input_file | A pdf file |
join (jn, j)¶
Join the input files into a single PDF file. If an input file is not a PDF file, it is converted into a monochrome, CCITT Group 4 compressed PDF file.
pdf-compress.py join [-h] [-o] [-l OCR_LANGUAGE [OCR_LANGUAGE ...]]
input_files [input_files ...]
Positional Arguments¶
input_files |
|
Named Arguments¶
-o, --ocr | Perform optical character recognition (OCR) on the input files. Default: False |
-l, --ocr-language | Run tesseract --list-langs to get your installed languages. |
samples (sp, s)¶
Convert the same image with different threshold values to find the best threshold value.
pdf-compress.py samples [-h] [-b] [-q] [-t] input_file
Positional Arguments¶
input_file | An image or a PDF file. The script randomly selects one page of a multi-page PDF to build the series with different threshold values. |
Named Arguments¶
-b, --blur | Convert images with different blur values. Default: False |
-q, --quality | Compress to JPEG2000 images in different quality steps. Default: False |
-t, --threshold | Convert images with different threshold values to monochrome black and white images. Default: False |
image-into-pdf.py¶
Add or replace one page in a PDF file with an image file of the same page size.
usage: image-into-pdf.py [-h] [-c] [-v] [-V]
{add,ad,a,convert,cv,c,replace,re,r} ...
Positional Arguments¶
subcmd_args | Possible choices: add, ad, a, convert, cv, c, replace, re, r Subcommand |
Named Arguments¶
-c, --colorize | Colorize the terminal output. Default: False |
-v, --verbose | Make the command line output more verbose. Default: False |
-V, --version | show program’s version number and exit |
Sub-commands:¶
add (ad, a)¶
Add one image to a PDF file.
image-into-pdf.py add [-h] [-a AFTER | -b BEFORE | -f | -l] image pdf
Positional Arguments¶
image | An image (or a PDF) file to add to the PDF file. |
pdf | The PDF file. |
Named Arguments¶
-a, --after | Place image after page X. |
-b, --before | Place image before page X. |
-f, --first | Place the image at the first position. Default: False |
-l, --last | Place the image at the last position. Default: False |
jfscripts¶
jfscripts package¶
Submodules¶
jfscripts._utils module¶
class jfscripts._utils.FilePath(path, absolute=False)
    Bases: object
    absolute = None
        Boolean that indicates whether the path is an absolute path or a relative path.
    base = None
        The path without an extension, e. g. /home/document/file.
    basename = None
        The basename of the file, e. g. file.
    extension = None
        The extension of the file, e. g. ext.
    filename = None
        The filename is the combination of the basename and the extension, e. g. file.ext.
    new(extension=None, append='', del_substring='')
        Returns: A new file path object.
    path = None
        The absolute (/home/document/file.ext) or the relative path (document/file.ext) of the file.
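How the documented FilePath attributes relate to each other can be shown with a small stand-in class built on os.path. `FilePathSketch` is a hypothetical reimplementation for illustration only; it covers the attributes listed above but not the `new()` method, and the real class may compute its values differently.

```python
import os


class FilePathSketch:
    """Illustrative stand-in that mirrors the documented attributes."""

    def __init__(self, path, absolute=False):
        self.absolute = absolute
        # Only resolve against the working directory when asked to.
        self.path = os.path.abspath(path) if absolute else path
        self.filename = os.path.basename(self.path)
        self.base, dot_extension = os.path.splitext(self.path)
        self.basename = os.path.basename(self.base)
        self.extension = dot_extension[1:]  # drop the leading dot
```

For the relative path `document/file.ext` this gives the base `document/file`, the basename `file`, the extension `ext` and the filename `file.ext`.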
class jfscripts._utils.Run(*args, **kwargs)
    Bases: object
    PIPE = -1
    run(*args, **kwargs)
        Returns: A CompletedProcess object.
        Return type: subprocess.CompletedProcess
jfscripts._utils.argparser_to_readme(argparser, template='README-template.md', destination='README.md', indentation=0, placeholder='{{ argparse }}')
    Add the formatted help output of a command line utility using the Python module argparse to a README file. Make sure to set the name of the program (prog) or you get strange program names.
    Parameters:
    - argparser (object) – The argparse parser object.
    - template (str) – The path of a template text file containing the placeholder. Default: README-template.md
    - destination (str) – The path of the destination file. Default: README.md
    - indentation (int) – Indent the formatted help output by X spaces. Default: 0
    - placeholder (str) – Placeholder string that gets replaced by the formatted help output. Default: {{ argparse }}
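The core of such a README generator can be sketched with `ArgumentParser.format_help()` and a plain string replacement. The helper below works on strings instead of files to stay self-contained; its name `render_readme` and its signature are assumptions made for this example, not the function documented above.

```python
import argparse


def render_readme(parser, template_text, placeholder='{{ argparse }}',
                  indentation=0):
    """Replace the placeholder in template_text by the parser's help text."""
    indent = ' ' * indentation
    help_lines = parser.format_help().splitlines()
    # Indent non-empty lines only, so blank lines stay free of whitespace.
    indented = '\n'.join(indent + line if line else line
                         for line in help_lines)
    return template_text.replace(placeholder, indented)


parser = argparse.ArgumentParser(prog='demo.py', description='Demo.')
readme = render_readme(parser, '# Demo\n\n{{ argparse }}\n', indentation=4)
```

Setting `prog` explicitly matters here: without it argparse derives the program name from `sys.argv[0]`, which is usually the name of the documentation build script.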
jfscripts._utils.check_dependencies(*executables, raise_error=True)
    Check if the given executables exist in $PATH.
    Parameters:
    - executables (tuple) – A tuple of executables to check for their existence in $PATH. Each element of the tuple can be either a string (e. g. pdfimages) or itself a tuple (‘pdfimages’, ‘poppler’). The first entry of this tuple is the name of the executable, the second entry is a description text which is displayed in the raised exception.
    - raise_error (bool) – Raise an error if an executable doesn’t exist.
    Returns: True if all executables exist, False if one or more executables do not exist.
    Return type: bool
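A dependency check of this kind can be built on `shutil.which()`. The following sketch follows the documented parameters, but the name `check_dependencies_sketch`, the wording of the error message and the use of `RuntimeError` are assumptions; the exception type raised by the real function is not stated in this documentation.

```python
import shutil


def check_dependencies_sketch(*executables, raise_error=True):
    """Return True if all executables are found in $PATH."""
    missing = []
    for executable in executables:
        # An entry is either a name or a (name, description) tuple.
        if isinstance(executable, tuple):
            name, description = executable
        else:
            name, description = executable, None
        if shutil.which(name) is None:
            missing.append(
                '{} ({})'.format(name, description) if description else name)
    if missing and raise_error:
        raise RuntimeError('Missing executables: ' + ', '.join(missing))
    return not missing
```

A call such as `check_dependencies_sketch(('pdfimages', 'poppler'), 'pdftk', raise_error=False)` reports whether both tools are installed without raising.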
jfscripts.dns_ipv6_prefix module¶
jfscripts.dns_ipv6_prefix.get_parser()
    The argument parser for the command line interface.
    Returns: An ArgumentParser object.
    Return type: argparse.ArgumentParser
jfscripts.extract_pdftext module¶
jfscripts.extract_pdftext.get_parser()
    The argument parser for the command line interface.
    Returns: An ArgumentParser object.
    Return type: argparse.ArgumentParser
jfscripts.find_dupes_by_size module¶
jfscripts.find_dupes_by_size.get_parser()
    The argument parser for the command line interface.
    Returns: An ArgumentParser object.
    Return type: argparse.ArgumentParser
jfscripts.list_files module¶
jfscripts.list_files._split_glob(glob_path)
    Split a file path containing glob wildcard characters (e. g. /data/(asterisk).txt) into a glob-free path prefix (e. g. /data) and a glob pattern (e. g. (asterisk).txt).
    Parameters: glob_path (str) – A file path containing glob wildcard characters.
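The split described here can be sketched by walking the path segments until the first one that contains a wildcard character. The function name `split_glob`, the set of characters treated as wildcards and the returned `(prefix, pattern)` tuple are assumptions for illustration; the return value of the real function is not documented above. The example writes the wildcard as `*`.

```python
GLOB_CHARS = '*?['


def split_glob(glob_path):
    """Split a path into a glob-free prefix and the remaining glob pattern."""
    parts = glob_path.split('/')
    for index, part in enumerate(parts):
        if any(char in part for char in GLOB_CHARS):
            # Everything before the first wildcard segment is the prefix.
            return '/'.join(parts[:index]), '/'.join(parts[index:])
    return glob_path, ''
```

With this sketch, `split_glob('/data/*.txt')` returns `('/data', '*.txt')`.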
jfscripts.list_files.doc_examples(command_name='', extension='txt', indent_spaces=0, inline=False)
jfscripts.list_files.get_parser()
    The argument parser for the command line interface.
    Returns: An ArgumentParser object.
    Return type: argparse.ArgumentParser
jfscripts.list_files.list_files(files, default_glob=None)
    Parameters:
    - files (list) – A list of file paths or a single-element list containing a glob string.
    - default_glob (string) – A default glob pattern like “(asterisk).txt”. This argument is only taken into account if “files” is a list with only one entry and this entry is a path to a directory.
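The parameter description suggests three cases: several paths, a single glob string, or a single directory combined with a default glob. The sketch below implements one reading of those cases with the standard glob module. The name `list_files_sketch` and details such as sorting the result are assumptions; the real function may behave differently, for example for paths that do not exist.

```python
import glob
import os


def list_files_sketch(files, default_glob=None):
    """Expand a list of paths, a glob string or a directory into file paths."""
    if len(files) != 1:
        # Several entries: the shell has already expanded any wildcard.
        return list(files)
    entry = files[0]
    if os.path.isdir(entry) and default_glob:
        pattern = os.path.join(entry, default_glob)
    else:
        pattern = entry
    return sorted(glob.glob(pattern))
```

Quoting the pattern on the command line (as in the examples with quotation marks) passes the glob string to the script unexpanded, which is the single-entry case handled here.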
jfscripts.mac_to_eui64 module¶
jfscripts.mac_to_eui64.get_parser()
    The argument parser for the command line interface.
    Returns: An ArgumentParser object.
    Return type: argparse.ArgumentParser
jfscripts.pdf_compress module¶
class jfscripts.pdf_compress.State(args)
    Bases: object
    This object holds runtime data for the multiprocessing environment.
    args = None
        argparse arguments
    common_path = None
        The common path prefix of all input files.
    cwd = None
        The current working directory
    first_input_file = None
        The first input file.
    input_files = None
        A list of all input files.
    input_is_pdf = None
        Boolean that indicates if the first file is a pdf.
class jfscripts.pdf_compress.Timer
    Bases: object
    Class to calculate the execution time. Mainly to test the speed improvements of the multiprocessing implementation.
    begin = None
        UNIX timestamp the execution began.
    end = None
        UNIX timestamp the execution ended.
jfscripts.pdf_compress._do_magick_command(command)
    ImageMagick version 7 introduces a new top level command named magick. Use this newer command if present.
    Returns: A list of command segments
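Choosing between the ImageMagick 7 entry point and the legacy commands can be sketched with `shutil.which()`. The function name `magick_command` and the injectable `which` parameter (used here only to make the example testable) are assumptions and not part of the documented function.

```python
import shutil


def magick_command(command, which=shutil.which):
    """Return the command segments for an ImageMagick subcommand."""
    if which('magick'):
        # ImageMagick 7: "magick convert ...", "magick identify ..."
        return ['magick', command]
    # Older versions ship "convert", "identify" etc. as separate commands.
    return [command]
```

The returned list can be extended with further arguments and passed to `subprocess.run()`.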
jfscripts.pdf_compress._do_magick_convert_enlighten_border(width, height)
    Build the command line arguments to enlighten the border in four regions.
    Returns: Command line arguments for ImageMagick’s convert.
jfscripts.pdf_compress.args = None
    The argparse object.
jfscripts.pdf_compress.check_threshold(value)
    Check if value is a valid threshold value.
    Parameters: value (integer or string)
    Returns: A normalized threshold string (90%)
    Return type: string
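Normalizing a threshold given as an integer or string into the form `90%` can be sketched with a regular expression. The name `check_threshold_sketch`, the accepted range of 0 to 100 and the use of `ValueError` are assumptions; the documentation above only states the normalized result format.

```python
import re


def check_threshold_sketch(value):
    """Normalize 90, '90' or '90%' to the string '90%'."""
    match = re.fullmatch(r'\s*(\d{1,3})\s*%?\s*', str(value))
    if match is None or int(match.group(1)) > 100:
        raise ValueError('{!r} is not a valid threshold.'.format(value))
    return '{}%'.format(int(match.group(1)))
```

A function of this shape can be passed to argparse as the `type` of an option, so that invalid values are rejected while parsing the command line.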
jfscripts.pdf_compress.cleanup(state)
    Delete all images using the temporary identifier in a common path.
    Parameters: state (jfscripts.pdf_compress.State) – The state object.
    Returns: None
jfscripts.pdf_compress.collect_images(state)
    Collect all images using the temporary identifier in a common path.
    Parameters: state (jfscripts.pdf_compress.State) – The state object.
    Returns: A sorted list of image paths.
    Return type: list
jfscripts.pdf_compress.convert_file_paths(files)
    Convert a list of file paths into a list of jfscripts._utils.FilePath objects.
    Parameters: files (list) – A list of file paths
    Returns: A list of jfscripts._utils.FilePath objects.
jfscripts.pdf_compress.do_magick_convert(input_file, output_file, threshold=None, enlighten_border=False, border=False, resize=False, deskew=False, trim=False, color=False, quality=75, blur=False)
    Convert an input image file using the subcommand convert of the ImageMagick suite.
    Returns: The output image file.
    Return type: jfscripts._utils.FilePath
jfscripts.pdf_compress.do_magick_identify(input_file)
    Get information about an image.
    Parameters: input_file (jfscripts._utils.FilePath) – The input file.
    Returns: A dictionary with the keys width, height and colors.
    Return type: dict
jfscripts.pdf_compress.do_pdfimages(pdf_file, state, page_number=None, use_tmp_identifier=True)
    Convert a PDF file to images in the TIFF format.
    Parameters:
    - pdf_file (jfscripts._utils.FilePath) – The input file.
    - state (jfscripts.pdf_compress.State) – The state object.
    - page_number (int) – Extract only the page with a specific page number.
    Returns: The return value of subprocess.run.
jfscripts.pdf_compress.do_pdfinfo_page_count(pdf_file)
    Get the number of pages in a PDF file.
    Parameters: pdf_file (str) – Path of the PDF file.
    Returns: Page count
    Return type: int
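The pdfinfo tool from poppler prints one "Key: value" line per property, including a `Pages:` line, so the page count can be read with a small parser. The sketch below separates the parsing from the subprocess call; the names `parse_page_count` and `pdfinfo_page_count` are hypothetical and the real function may be implemented differently.

```python
import re
import subprocess


def parse_page_count(pdfinfo_output):
    """Extract the page count from the text output of pdfinfo."""
    match = re.search(r'^Pages:\s+(\d+)', pdfinfo_output, re.MULTILINE)
    if match is None:
        raise ValueError('No "Pages:" line found in the pdfinfo output.')
    return int(match.group(1))


def pdfinfo_page_count(pdf_file):
    """Run pdfinfo on pdf_file (requires poppler) and return the page count."""
    result = subprocess.run(['pdfinfo', pdf_file], stdout=subprocess.PIPE,
                            check=True, universal_newlines=True)
    return parse_page_count(result.stdout)
```

Keeping the parser separate makes it possible to check it against sample output without having poppler installed.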
jfscripts.pdf_compress.do_pdftk_cat(pdf_files, state)
    Join a list of PDF files into a single PDF file using the tool pdftk.
    Parameters:
    - pdf_files (list) – A list of PDF files
    - state (jfscripts.pdf_compress.State) – The state object.
    Returns: None
jfscripts.pdf_compress.get_parser()
    The argument parser for the command line interface.
    Returns: An ArgumentParser object.
    Return type: argparse.ArgumentParser
jfscripts.pdf_compress.identifier = 'magick'
    To allow better assignment of the output files.
jfscripts.pdf_compress.subcommand_convert_file(arguments)
    Manipulate one input file.
    Parameters: arguments (tuple) – A tuple containing two elements: The first element is the input_file file object and the second element is the state object.
jfscripts.pdf_compress.subcommand_samples(input_file, state)
    Generate a list of example files with different threshold values.
    Parameters:
    - input_file (jfscripts._utils.FilePath) – The input file.
    - state (jfscripts.pdf_compress.State) – The state object.
    Returns: None
jfscripts.pdf_compress.tmp_identifier = 'magick_1452b966-b582-11ea-9326-0242ac110002'
    Used for the identification of temporary files.
jfscripts.image_into_pdf module¶
jfscripts.image_into_pdf.assemble_pdf(main_pdf, insert_pdf, page_count, page_number, mode='add', position='before')
    Parameters:
    - main_pdf (str) – Path of the main PDF file.
    - insert_pdf (str) – Path of the PDF file to insert into the main PDF file.
    - page_count (int) – Page count of the main PDF file.
    - page_number (int) – Page number in the main PDF file at which the insert PDF file is added, or which it replaces.
    - mode (string) – Mode how the PDF to insert is treated. Possible choices are: add or replace.
    - position (str) – Possible choices: before and after
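How the parameters interact can be illustrated by computing the page ranges for a pdftk `cat` call, in which handle A stands for the main PDF and handle B for the PDF to insert. The helper `page_ranges` below is a hypothetical sketch and not the documented function; whether the real implementation builds its pdftk arguments this way is an assumption.

```python
def page_ranges(page_count, page_number, mode='add', position='before'):
    """Return pdftk page ranges; A is the main PDF, B the inserted PDF."""
    if mode == 'replace':
        head_end, tail_start = page_number - 1, page_number + 1
    elif position == 'before':
        head_end, tail_start = page_number - 1, page_number
    else:  # add after the given page
        head_end, tail_start = page_number, page_number + 1
    ranges = []
    if head_end >= 1:
        ranges.append('A1-{}'.format(head_end))
    ranges.append('B1')
    if tail_start <= page_count:
        ranges.append('A{}-{}'.format(tail_start, page_count))
    return ranges
```

The resulting ranges would be used in a call of the form `pdftk A=main.pdf B=insert.pdf cat A1-2 B1 A3-5 output joined.pdf`, where the file names are placeholders.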
jfscripts.image_into_pdf.convert_image_to_pdf_page(image, image_width, pdf_width, pdf_density_x)
jfscripts.image_into_pdf.do_pdftk_cat_first_page(pdf_file)
    The command magick identify is very slow on PDF files with many pages, because it examines every page. We extract the first page to get some information about the dimensions of the PDF file.
jfscripts.image_into_pdf.get_parser()
    The argument parser for the command line interface.
    Returns: An ArgumentParser object.
    Return type: argparse.ArgumentParser
jfscripts¶
A collection of my personal Python scripts.
dns-ipv6-prefix.py¶
usage: dns-ipv6-prefix.py [-h] [-V] dnsname
Get the ipv6 prefix from a DNS name.
positional arguments:
dnsname The DNS name, e. g. josef-friedrich.de
optional arguments:
-h, --help show this help message and exit
-V, --version show program's version number and exit
extract-pdftext.py¶
usage: extract-pdftext.py [-h] [-c] [-v] [-V] file
positional arguments:
file A PDF file containing text
optional arguments:
-h, --help show this help message and exit
-c, --colorize Colorize the terminal output.
-v, --verbose Make the command line output more verbose.
-V, --version show program's version number and exit
find-dupes-by-size.py¶
usage: find-dupes-by-size.py [-h] [-V] path
Find duplicate files by size.
positional arguments:
path A directory to recursively search for duplicate files.
optional arguments:
-h, --help show this help message and exit
-V, --version show program's version number and exit
list-files.py¶
usage: list-files.py [-h] [-V] input_files [input_files ...]
This is a script to demonstrate the list_files() function in this file.
list-files.py a.txt
list-files.py a.txt b.txt c.txt
list-files.py (asterisk).txt
list-files.py "(asterisk).txt"
list-files.py dir/
list-files.py "dir/(asterisk).txt"
positional arguments:
input_files Examples for this arguments are: “a.txt”, “a.txt b.txt
c.txt”, “(asterisk).txt”, “"(asterisk).txt"”, “dir/”,
“"dir/(asterisk).txt"”
optional arguments:
-h, --help show this help message and exit
-V, --version show program's version number and exit
mac-to-eui64.py¶
usage: mac-to-eui64.py [-h] [-V] mac prefix
Convert mac addresses to EUI64 ipv6 addresses.
positional arguments:
mac The mac address.
prefix The ipv6 /64 prefix.
optional arguments:
-h, --help show this help message and exit
-V, --version show program's version number and exit
pdf-compress.py¶
usage: pdf-compress.py [-h] [-c] [-m] [-N] [-v] [-V]
{convert,con,c,extract,ex,e,join,jn,j,samples,sp,s,unify,un,u}
...
Convert and compress PDF scans. Make scans suitable for imslp.org
(International Music Score Library Project). See also
http://imslp.org/wiki/IMSLP:Musiknoten_beisteuern. The output files are
monochrome bitmap images at a resolution of 600 dpi and the compression format
CCITT group 4.
positional arguments:
{convert,con,c,extract,ex,e,join,jn,j,samples,sp,s,unify,un,u}
Subcommand
optional arguments:
-h, --help show this help message and exit
-c, --colorize Colorize the terminal output.
-m, --multiprocessing
Use multiprocessing to run commands in parallel.
-N, --no-cleanup Don’t clean up the temporary files.
-v, --verbose Make the command line output more verbose.
-V, --version show program's version number and exit
image-into-pdf.py¶
usage: image-into-pdf.py [-h] [-c] [-v] [-V]
{add,ad,a,convert,cv,c,replace,re,r} ...
Add or replace one page in a PDF file with an image file of the same page
size.
positional arguments:
{add,ad,a,convert,cv,c,replace,re,r}
Subcmd_args
optional arguments:
-h, --help show this help message and exit
-c, --colorize Colorize the terminal output.
-v, --verbose Make the cmd_args line output more verbose.
-V, --version show program's version number and exit