As mentioned before, we use Portfolio to archive old projects.
However since moving to Portfolio v1.x (from v11), we’ve had numerous issues. The main one has been getting the 40TB of data we host re-cataloged into Portfolio.
I’ve recently deep dived into Portfolio, & found some files that cause issues & these can be found via the logs. Parsing them allows me to “fix” the troublesome files or delete any corrupt ones.
Below is the how I’m parsing the logs.
Contents
Log Files
On OS X , Portfolio stores it’s logs in /Applications/Extensis/Portfolio Server/logs/, however this location can be set within Portfolio Administration under “Global Settings” > “Logging Configuration”.
The full list of logs can be found below:
admin_server_access.log admin_server.log boot.log elasticsearch_index_indexing_slowlog.log elasticsearch_index_indexing_slowlog.log elasticsearch.log extensis.admin.log jboss.out media-delegate_access.log media-delegate.log native-agent.log native-processor.log server.log usage.log
The two main logs files that hold the information we need are:
media-delegate.log server.log
media-delegate.log
Within media-delegate.log you’ll see entries like the below for any files that Portfolio has had an issue processing:
2016-01-11 04:43:28,896 [] ERROR [extensis.portfolio.mediaprocessor.BaseClassServlet] invalid json org.json.JSONException: A JSONArray text must start with '[' at character 1 of Error: File format error - /Volumes/Archive2/autumn research/new research/b110.gif [{ "SourceFile": "/Volumes/Archive2/autumn research/new research/b110.gif" }] at org.json.JSONTokener.syntaxError(JSONTokener.java:448) at org.json.JSONArray.<init>(JSONArray.java:107) at org.json.JSONArray.<init>(JSONArray.java:146) at extensis.portfolio.mediaprocessor.ExifToolFacility.process(ExifToolFacility.java:175) at extensis.portfolio.mediaprocessor.DefaultMediaHandler.handleMedia(MediaHandler.java:248) at extensis.portfolio.mediaprocessor.MediaProcessorServlet.handleProcessFile(MediaProcessorServlet.java:543) at extensis.portfolio.mediaprocessor.MediaProcessorServlet.doPost(MediaProcessorServlet.java:451) at javax.servlet.http.HttpServlet.service(HttpServlet.java:644) at javax.servlet.http.HttpServlet.service(HttpServlet.java:725) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:291) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:239) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:219) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:106) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:136) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:79) at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:610) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:88) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:526) at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1078) at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:655) at org.apache.coyote.http11.Http11NioProtocol$Http11ConnectionHandler.process(Http11NioProtocol.java:222) at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1566) at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1523) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) at java.lang.Thread.run(Thread.java:745)
You’ll see multiple entries with similar for the same file whilst Portfolio tries & tries again to generate a preview for the file.
The bit we need is:
"SourceFile": "/Volumes/Archive2/autumn research/new research/b110.gif"
And from that, just the path to the file itself.
Some other entries might look like:
2016-01-11 07:31:37,617 [] ERROR [extensis.portfolio.mediaprocessor.BaseClassServlet] No workfile produced from /Volumes/Archive2/A8 Q3 CL 000 HOVER.eps. Ghostscript output: Error: /configurationerror in --setpagedevice-- Additional information: [/PageSize [0 0]] Operand stack: --nostringval-- (%%BoundingBox: 0 0 0 0) --dict:1/2(ro)(G)-- --nostringval-- --nostringval-- (%%BoundingBox: 0 0 0 0) 1 0 0 --dict:1/1(L)-- Execution stack: %interp_exit .runexec2 --nostringval-- --nostringval-- --nostringval-- 2 %stopped_push --nostringval-- --nostringval-- --nostringval-- false 1 %stopped_push 1914 1 3 %oparray_pop 1914 1 3 %oparray_pop 1913 1 3 %oparray_pop --nostringval-- 1897 1 3 %oparray_pop 1787 1 3 %oparray_pop --nostringval-- %errorexec_pop .runexec2 --nostringval-- --nostringval-- --nostringval-- 2 %stopped_push --nostringval-- %ztokenexec_continue (scanner state) --nostringval-- --nostringval-- --nostringval-- --nostringval-- --nostringval-- --nostringval-- 1882 10 3 %oparray_pop --nostringval-- --nostringval-- --dict:1/10(L)-- --nostringval-- 9 %dict_continue Dictionary stack: --dict:1174/1684(ro)(G)-- --dict:0/20(G)-- --dict:81/200(L)-- Current allocation mode is local GPL Ghostscript 9.06: Unrecoverable error, exit code 1
Again from that all we want is the path from the below:
2016-01-11 07:31:37,617 [] ERROR [extensis.portfolio.mediaprocessor.BaseClassServlet] No workfile produced from /Volumes/Archive2/A8 Q3 CL 000 HOVER.eps. Ghostscript output: Error: /configurationerror in --setpagedevice--
This log seems to pick up files that might be corrupted or damaged as well as some that are in a format that Portfolio cannot generate a preview from.
Some example errors received from opening some files found via the above are shown below:
server.log
Within server.log, we find entries for troublesome files like the below. The files seem to be in a “bad format” that causes Portfolio to struggle generating previews.
2016-01-11 23:59:53,126 ERROR [extensis.portfolio] (Asset Processor 7) error processing file ::SERVER:Archive2:clan crest.eps: java.io.IOException: Server returned HTTP response code: 500 for URL: http://localhost:40372/portfolio-media-processor/MediaProcessor/ at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1626) [rt.jar:1.7.0_71] at extensis.portfolio.server.mediaprocessing.MediaProcessor$FileProcessor.handleInput(MediaProcessor.java:603) [portfolio-classes.jar:] at extensis.portfolio.util.HttpURLConnectionWrapper.sendHttpRequest(HttpURLConnectionWrapper.java:24) [portfolio-classes.jar:] at extensis.portfolio.server.mediaprocessing.MediaProcessor.processFile(MediaProcessor.java:813) [portfolio-classes.jar:] at extensis.portfolio.server.mediaprocessing.MediaProcessor.sendRequestInner(MediaProcessor.java:54) [portfolio-classes.jar:] at extensis.portfolio.server.mediaprocessing.MediaProcessor.sendRequest(MediaProcessor.java:43) [portfolio-classes.jar:] at extensis.portfolio.server.service.Cataloger.processFile(Cataloger.java:990) [portfolio-classes.jar:] at extensis.portfolio.server.service.Cataloger.updateAsset(Cataloger.java:843) [portfolio-classes.jar:] at extensis.portfolio.server.service.Cataloger.updateAssetById(Cataloger.java:444) [portfolio-classes.jar:] at extensis.portfolio.server.async.QueueItems$UpdateItem.executeInternal(QueueItems.java:165) [portfolio-classes.jar:] at extensis.portfolio.server.async.QueueItems$RetryItem.executeInternal(QueueItems.java:247) [portfolio-classes.jar:] at extensis.portfolio.server.async.QueueItems$MediaEngineItem.executeWithMediaEngine(QueueItems.java:107) [portfolio-classes.jar:] at extensis.portfolio.server.async.QueueItems$UpdateItem.execute(QueueItems.java:156) [portfolio-classes.jar:] at extensis.portfolio.server.async.Runner.run(AssetProcessingEngine.java:502) [portfolio-classes.jar:] at java.lang.Thread.run(Thread.java:745) [rt.jar:1.7.0_71]
Where the bit we’re interesting in is:
error processing file ::SERVER:Archive2:clan crest.eps:
I’ve seen this will files in legacy formats (such as .art files) as well as EPS files that are not PDF compatible (that option is shown below).
As mentioned, EPS files seem to be a big issue. On opening some of these i’ve had prompts like the below. Now note, this is 2016 & i’m opening files in CS6 that are considered “legacy”. So some of these files & formats go way back:
More parsing
More parsing can be done as Portfolio leverages several open source technologies under such as:
Amongst others.
Each outputs a differing error to the logs, & so you might want to change what is parsed depending on your needs.
The Script(s)
I have 2 versions of the Python scripts I’m using to parse the logs.
One script is for running frequently to check todays logs, the other is to run to check yesterdays.
The idea is to run the today script whilst cataloging a lot of items, & then only run the yesterday script going forward.
However, Portfolio does not always rotate it’s logs. So the yesterday script will try & find the log files that are appended with todays date. If that fails it will use the current logs.
If neither script can find todays logs, it will error & send an email.
If any entries based on the above criteria are found in either script, they are then put into a list & the duplicates are removed, entries ordered & then the script writes the files to the scripts logs file which it overwrites with every run & will email you then contents of if any files are found.
You’ll need to set the values of the variables found in the “# Variables to set manually” section yourself.
Output
The scripts will write to their log file with every run, with each run overwriting it’s contents.
If any files found or the log files are missing, they will then email. The email will have a subject like the 1st line & content like the following.
Today Bad Files.py: 6 Bad files & 5 Corrupt files found ----- 6 Bad Format Files ------ SERVER:Archive2:artwork:chav bodies.art: SERVER:Archive2:artwork:lines.art: SERVER:Archive2:artwork:moon surface.art: SERVER:Archive2:artwork:mother trucker.art: SERVER:Archive2:artwork:pink drawing.art: SERVER:Archive2:scan in girls:illustration1.art: ----- 5 Corrupt Files ----- "/Volumes/Archive1/Portfolio/image ref/3752674533_3063245233_b-1.jpg" "/Volumes/Archive1/Portfolio/image ref/3752674533_3063245233_b.jpg" "/Volumes/Archive1/Portfolio/image ref/3752676055_6491eed6c9_b.jpg" "/Volumes/Archive1/Portfolio/image ref/3752677315_64bdf53fd7_b.jpg" "/Volumes/Archive1/Portfolio/image ref/3768850119_33bfb67d2b_b.jpg"
Today
This script can be found here or below:
Yesterday
This script can be found here or below: