mla_extract_pdf_metadata()
_build_pdf_indirect_objects()
_extract_pdf_trailer()
_find_pdf_indirect_dictionary()
_parse_pdf_LPD_dictionary()
_parse_pdf_UTF16BE()
_parse_pdf_dictionary()
_parse_pdf_string()
_parse_pdf_xref_section()
_parse_pdf_xref_stream()
_parse_pdf_xref_subsection()
$pdf_indirect_objects
Class MLA (Media Library Assistant) PDF extracts legacy and XMP metadata from PDF files.
package | Media Library Assistant |
---|---|
since | 2.10 |
_build_pdf_indirect_objects(string $string) : void
Creates the array of indirect object offsets and lengths
since | 2.10 |
---|
string
The entire PDF document, passed by reference
_extract_pdf_trailer(string $file_name, integer $file_offset) : mixed
since | 2.10 |
---|
string
full path to the desired file
integer
offset within file of the cross-reference table
mixed
array of "PDF dictionary arrays", newest first, or NULL on failure
_find_pdf_indirect_dictionary(string $file_name, integer $object, integer $generation, integer $instance) : mixed
The function searches the entire file, if necessary, to find the last/most recent copy of the object. This is required because Adobe Acrobat does NOT increment the generation number when it reuses an object.
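The "last copy wins" search described above can be sketched as follows. This is a minimal illustration, not the MLA implementation: `sketch_find_last_object()` is a hypothetical helper, it works on a string already in memory rather than on a file name, and it assumes each object ends at the first `endobj` after its header.

```php
<?php
/**
 * Locate the "N G obj" headers in a PDF buffer and return one instance.
 * Hypothetical helper; not part of the MLA class.
 *
 * @param string   $pdf        Entire PDF document.
 * @param int      $object     Object number.
 * @param int      $generation Generation number.
 * @param int|null $instance   Zero-based instance to return; NULL means the last one.
 * @return array|null array( 'start' => offset, 'length' => bytes ) or NULL on failure.
 */
function sketch_find_last_object( $pdf, $object, $generation = 0, $instance = null ) {
    // The lookbehind keeps "12 0 obj" from matching inside "112 0 obj".
    $pattern = '/(?<![0-9])' . (int) $object . '\s+' . (int) $generation . '\s+obj\b/';
    if ( ! preg_match_all( $pattern, $pdf, $matches, PREG_OFFSET_CAPTURE ) ) {
        return null;
    }

    // Matches are in file order, so the last one is the most recent copy.
    $found = $matches[0];
    $index = ( null === $instance ) ? count( $found ) - 1 : (int) $instance;
    if ( ! isset( $found[ $index ] ) ) {
        return null;
    }

    $start = $found[ $index ][1];
    $end   = strpos( $pdf, 'endobj', $start );
    if ( false === $end ) {
        return null;
    }

    return array( 'start' => $start, 'length' => $end + strlen( 'endobj' ) - $start );
}

// Two copies of object 1 with the same generation number, as after an incremental update.
$pdf    = "1 0 obj\n<< /Title (old) >>\nendobj\n1 0 obj\n<< /Title (new) >>\nendobj\n";
$latest = sketch_find_last_object( $pdf, 1 );
echo substr( $pdf, $latest['start'], $latest['length'] ), "\n";
```

Because both copies carry generation 0, the generation number alone cannot tell them apart; only the position in the file identifies the newer one.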
since | 2.10 |
---|
string
full path and file name
integer
The object number
integer
The object generation number; default zero (0)
integer
The desired object instance (when multiple instances are present); default "highest/latest"
mixed
NULL on failure else array( 'start' => offset in the file, 'length' => object length, 'content' => dictionary contents )
_parse_pdf_LPD_dictionary(string $source_string, integer $filesize) : mixed
Returns an array of dictionary contents, classified by object type: boolean, numeric, string, hex (string), indirect (object), name, array, dictionary, stream, and null. The array also has a '/length' element containing the number of bytes occupied by the dictionary in the source string, excluding the enclosing delimiters, if passed in.
since | 2.10 |
---|
string
data within which the object occurs, typically the start of a PDF document
integer
filesize of the PDF document, for validation purposes, or zero (0) to ignore filesize
mixed
array of dictionary objects on success, false on failure
_parse_pdf_UTF16BE(string $source_string) : string
since | 2.10 |
---|
string
PDF string of 16-bit characters
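A conversion of this kind can be sketched with PHP's `iconv()`. This is an assumption about the approach, not the MLA code: `sketch_utf16be_to_utf8()` is a hypothetical name, and the sketch assumes the iconv extension is available. PDF text strings encoded as UTF-16BE begin with the byte order mark FE FF, which the sketch strips before converting.

```php
<?php
/**
 * Convert a UTF-16BE byte string to UTF-8.
 * Hypothetical helper; not part of the MLA class.
 *
 * @param string $source Bytes of 16-bit big-endian characters, with or without a BOM.
 * @return string|false UTF-8 text, or false if iconv cannot convert the input.
 */
function sketch_utf16be_to_utf8( $source ) {
    // Drop the byte order mark (FE FF) that prefixes UTF-16BE PDF text strings.
    if ( "\xFE\xFF" === substr( $source, 0, 2 ) ) {
        $source = substr( $source, 2 );
    }

    return iconv( 'UTF-16BE', 'UTF-8', $source );
}

// "Café" as UTF-16BE, preceded by the byte order mark.
echo sketch_utf16be_to_utf8( "\xFE\xFF\x00C\x00a\x00f\x00\xE9" ), "\n";
```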
string
UTF-8 encoded string
_parse_pdf_dictionary(string $source_string, integer $offset) : array
Returns an array of dictionary contents, classified by object type: boolean, numeric, string, hex (string), indirect (object), name, array, dictionary, stream, and null. The array also has a '/length' element containing the number of bytes occupied by the dictionary in the source string, excluding the enclosing delimiters.
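To make the shape of that array concrete, here is a hand-written sample and a small reader for it. Everything below is illustrative: the sample is not captured from MLA, the `'/length'` value is made up, whether keys carry a leading slash is an assumption, and `sketch_dictionary_value()` is a hypothetical helper.

```php
<?php
// Hand-written sample in the documented shape; not actual MLA output.
$sample_dictionary = array(
    '/length' => 34, // illustrative byte count only
    'Title'   => array( 'type' => 'string', 'value' => 'Annual Report' ),
    'Count'   => array( 'type' => 'numeric', 'value' => 3 ),
    'Type'    => array( 'type' => 'name', 'value' => 'Catalog' ),
);

/**
 * Read one field from a parsed dictionary array, optionally checking its type.
 * Hypothetical helper; not part of the MLA class.
 *
 * @param array       $dictionary Array in the shape described above.
 * @param string      $key        Field name to look up.
 * @param string|null $type       Required type, or NULL to accept any type.
 * @return mixed Field value, or NULL if the field is missing or has another type.
 */
function sketch_dictionary_value( array $dictionary, $key, $type = null ) {
    // '/length' is bookkeeping about the source bytes, not a dictionary field.
    if ( '/length' === $key || ! isset( $dictionary[ $key ] ) || ! is_array( $dictionary[ $key ] ) ) {
        return null;
    }

    $entry = $dictionary[ $key ];
    if ( null !== $type && $type !== $entry['type'] ) {
        return null;
    }

    return $entry['value'];
}

var_dump( sketch_dictionary_value( $sample_dictionary, 'Title', 'string' ) );
var_dump( sketch_dictionary_value( $sample_dictionary, 'Count', 'string' ) ); // wrong type requested
```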
since | 2.10 |
---|
string
data within which the string occurs
integer
offset within the source string of the opening '<<' characters or the first content character.
array
( '/length' => length, key => array( 'type' => type, 'value' => value ) ) for each dictionary field
_parse_pdf_string(string $source_string, integer $offset) : array
Returns an array with one dictionary entry. The array also has a '/length' element containing the number of bytes occupied by the string in the source string, including the enclosing parentheses.
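A literal-string parser along these lines might look like the sketch below. It is not the MLA code: `sketch_parse_literal_string()` is a hypothetical name, and for brevity it returns only the inner `type`/`value`/`'/length'` array rather than wrapping it under a key. It follows the PDF rules for literal strings: balanced unescaped parentheses are allowed, and a backslash introduces an escape, an octal character code, or a line continuation.

```php
<?php
/**
 * Parse a PDF literal string that starts at the given '(' character.
 * Hypothetical helper; not part of the MLA class.
 *
 * @param string $source Data within which the string occurs.
 * @param int    $offset Offset of the opening '(' character.
 * @return array|null array( 'type', 'value', '/length' ) or NULL if malformed.
 */
function sketch_parse_literal_string( $source, $offset ) {
    if ( ! isset( $source[ $offset ] ) || '(' !== $source[ $offset ] ) {
        return null;
    }

    $escapes = array( 'n' => "\n", 'r' => "\r", 't' => "\t", 'b' => "\x08", 'f' => "\x0C" );
    $value   = '';
    $depth   = 1;
    $index   = $offset + 1;
    $length  = strlen( $source );

    while ( $index < $length ) {
        $char = $source[ $index++ ];

        if ( '\\' === $char && $index < $length ) {
            $next = $source[ $index++ ];
            if ( isset( $escapes[ $next ] ) ) {
                $value .= $escapes[ $next ];
            } elseif ( false !== strpos( '01234567', $next ) ) {
                // Octal character code: one to three digits.
                $octal = $next;
                while ( strlen( $octal ) < 3 && $index < $length && false !== strpos( '01234567', $source[ $index ] ) ) {
                    $octal .= $source[ $index++ ];
                }
                $value .= chr( octdec( $octal ) & 0xFF );
            } elseif ( "\r" === $next || "\n" === $next ) {
                // Backslash before an end-of-line is a continuation; treat CRLF as one EOL.
                if ( "\r" === $next && $index < $length && "\n" === $source[ $index ] ) {
                    $index++;
                }
            } else {
                // Covers \( \) \\ and any unrecognized escape.
                $value .= $next;
            }
            continue;
        }

        if ( '(' === $char ) {
            $depth++;
        } elseif ( ')' === $char && 0 === --$depth ) {
            // $index now points just past the closing parenthesis.
            return array( 'type' => 'string', 'value' => $value, '/length' => $index - $offset );
        }

        $value .= $char;
    }

    return null; // Ran out of data before the parentheses balanced.
}

print_r( sketch_parse_literal_string( '/Title (Q1 \(draft\))', 7 ) );
```

Counting the enclosing parentheses in `'/length'` lets a caller resume scanning at `$offset + $result['/length']` without re-examining the string.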
since | 2.10 |
---|
string
data within which the string occurs
integer
offset within the source string of the opening '(' character.
array
( key => array( 'type' => type, 'value' => value, '/length' => length ) ) for the string
_parse_pdf_xref_section(string $file_name, integer $file_offset) : integer
Creates the array of indirect object offsets and lengths
since | 2.10 |
---|
string
full path and file name
integer
offset within the file of the xref id and count entry
integer
length of the section
_parse_pdf_xref_stream(string $file_name, integer $file_offset, string $entry_parms_string) : integer
Creates the array of indirect object offsets and lengths
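In a cross-reference stream, each entry is a fixed-size binary record whose three field widths come from the "/W" entry. The sketch below shows how such records could be unpacked once the stream data has been decoded. It is an illustration only: `sketch_unpack_xref_stream()` is a hypothetical helper, it assumes the "/W" text looks like `[1 2 1]`, and it assumes the caller has already decompressed the stream.

```php
<?php
/**
 * Unpack fixed-width, big-endian cross-reference stream entries.
 * Hypothetical helper; not part of the MLA class.
 *
 * @param string $data         Decoded stream bytes.
 * @param string $w_entry      Text of the "/W" entry, e.g. "[1 2 1]" (assumed format).
 * @param int    $first_object Object number of the first entry.
 * @return array Entries keyed by object number.
 */
function sketch_unpack_xref_stream( $data, $w_entry, $first_object = 0 ) {
    // Pull the three field widths out of the "/W" text.
    preg_match_all( '/\d+/', $w_entry, $matches );
    $widths = array_map( 'intval', $matches[0] );
    if ( 3 !== count( $widths ) || 0 === array_sum( $widths ) ) {
        return array();
    }

    $entry_size = array_sum( $widths );
    $count      = intdiv( strlen( $data ), $entry_size ); // ignore a trailing partial entry
    $entries    = array();

    for ( $i = 0; $i < $count; $i++ ) {
        $position = $i * $entry_size;
        $fields   = array();
        foreach ( $widths as $width ) {
            // Accumulate the field's bytes, high-order byte first.
            $number = 0;
            for ( $byte = 0; $byte < $width; $byte++ ) {
                $number = ( $number << 8 ) | ord( $data[ $position++ ] );
            }
            $fields[] = $number;
        }

        // A zero-width type field means the type defaults to 1 (in use, uncompressed).
        $type = ( 0 === $widths[0] ) ? 1 : $fields[0];

        $entries[ $first_object + $i ] = array(
            'type'   => $type,      // 0 = free, 1 = in use, 2 = stored in an object stream
            'field2' => $fields[1], // type 1: byte offset of the object in the file
            'field3' => $fields[2], // type 1: generation number
        );
    }

    return $entries;
}

print_r( sketch_unpack_xref_stream( "\x01\x00\x0F\x00\x02\x00\x05\x03", '[1 2 1]' ) );
```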
since | 2.10 |
---|
string
full path and file name
integer
offset within the file of the xref id and count entry
string
"/W" entry, representing the size of the fields in a single entry
integer
length of the stream
_parse_pdf_xref_subsection(string $xref_section, integer $offset, integer $object_id, integer $count) : void
A cross-reference subsection is a sequence of 20-byte entries, each with offset and generation values.
since | 2.10 |
---|
string
buffer containing the subsection
integer
offset within the buffer of the first entry
integer
number of the first object in the subsection
integer
number of entries in the subsection
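The 20-byte entry layout can be illustrated with a short sketch. Each entry in a classic cross-reference table is a 10-digit byte offset, a space, a 5-digit generation number, a space, the letter `n` (in use) or `f` (free), and a two-character end-of-line. `sketch_parse_xref_subsection()` below is a hypothetical helper that returns its results instead of filling a class property, which is an assumption made to keep the example self-contained.

```php
<?php
/**
 * Read the in-use entries from one cross-reference subsection.
 * Hypothetical helper; not part of the MLA class.
 *
 * @param string $buffer    Buffer containing the subsection.
 * @param int    $offset    Offset within the buffer of the first entry.
 * @param int    $object_id Number of the first object in the subsection.
 * @param int    $count     Number of entries in the subsection.
 * @return array In-use entries keyed by object number.
 */
function sketch_parse_xref_subsection( $buffer, $offset, $object_id, $count ) {
    $objects = array();

    for ( $i = 0; $i < $count; $i++ ) {
        // Every entry occupies exactly 20 bytes, so entry $i starts at a fixed position.
        $entry = (string) substr( $buffer, $offset + ( 20 * $i ), 20 );
        if ( ! preg_match( '/^(\d{10}) (\d{5}) ([nf])/', $entry, $matches ) ) {
            break; // Malformed or truncated entry; stop reading.
        }

        // Free entries ("f") do not point at an object, so skip them.
        if ( 'n' === $matches[3] ) {
            $objects[ $object_id + $i ] = array(
                'number'     => $object_id + $i,
                'generation' => (int) $matches[2],
                'start'      => (int) $matches[1],
            );
        }
    }

    return $objects;
}

$subsection = "0000000000 65535 f \n0000000017 00000 n \n0000000081 00002 n \n";
print_r( sketch_parse_xref_subsection( $subsection, 0, 0, 3 ) );
```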
$pdf_indirect_objects : array
This array contains all of the indirect object offsets and lengths. The array key is ( object ID * 1000 ) + object generation. The array value is array( number, generation, start, optional /length )
since | 2.10 |
---|
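The key arithmetic can be sketched in a few lines. The helper names below are hypothetical, and the element names in the stored value (`number`, `generation`, `start`) are taken from the description above on the assumption that they are used as array keys.

```php
<?php
/**
 * Build the array key for an indirect object: ( object ID * 1000 ) + generation.
 * Hypothetical helper; not part of the MLA class.
 */
function sketch_indirect_object_key( $object_id, $generation = 0 ) {
    return ( $object_id * 1000 ) + $generation;
}

/**
 * Split a key back into its object ID and generation.
 * Only valid when the generation is below 1000.
 */
function sketch_split_indirect_object_key( $key ) {
    return array( 'number' => intdiv( $key, 1000 ), 'generation' => $key % 1000 );
}

$pdf_indirect_objects = array();
$key                  = sketch_indirect_object_key( 12, 0 );

$pdf_indirect_objects[ $key ] = array( 'number' => 12, 'generation' => 0, 'start' => 1543 );

print_r( sketch_split_indirect_object_key( $key ) );
```

One consequence of this scheme is that keys are unique only while generation numbers stay below 1000; for example, object 1 at generation 1000 and object 2 at generation 0 would both map to 2000.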