Hi,

I think this may be a problem with the certbot certificate (no access to https://nikitatest.uio.no/) or a local DNS problem.

Could this be because you are calling --baseurl https://nikita.example.com/noark5v5/ instead of --baseurl https://nikitatest.uio.no/noark5v5/?

Or have you set nikita.example.com in /etc/hosts to point to 129.240.15.244 (nikitatest.uio.no)?
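
One way to narrow down the DNS question is to check what the name resolves to on the machine that runs import-email. The following is a minimal Python sketch, assuming the hostname and address mentioned above; the function names resolved_addresses and points_at are made up for illustration and are not part of noark5-tester.

```python
import socket


def resolved_addresses(hostname, resolver=socket.getaddrinfo):
    """Return the sorted set of IP addresses the system resolver gives for hostname."""
    # getaddrinfo goes through the system resolver, which normally also
    # consults /etc/hosts, so a local override shows up here.
    infos = resolver(hostname, 443, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address as a string.
    return sorted({info[4][0] for info in infos})


def points_at(hostname, expected_ip, resolver=socket.getaddrinfo):
    """Return True if hostname resolves to expected_ip on this machine."""
    return expected_ip in resolved_addresses(hostname, resolver)
```

Calling points_at("nikita.example.com", "129.240.15.244") on the client should show whether the name used in --baseurl actually reaches nikitatest.uio.no. The resolver parameter exists only so the check can be exercised without network access.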

You can find configuration information for Apache HTTPD and Nikita behind firewalls with HTTPS, based on certbot-auto and the Apache module
mod_rewrite, at https://github.com/arkivarium/book

   HTTPD Configuration
    Install Apache 2 and download the core from gitlab.com in /var/www/html/
      cd /var/www/html/
      git clone https://gitlab.com/OsloMet-ABI/nikita-noark5-core
   
    Configure Apache 2 in /etc/apache2/sites-available/000-default.conf
      <VirtualHost www.arkivarium.no:80>
        ServerName www.arkivarium.no
        ServerAdmin webmaster@arkivarium.no
        DocumentRoot /var/www/html/nikita-noark5-core/web/
        ErrorLog ${APACHE_LOG_DIR}/www.arkivarium.no-error.log
        CustomLog ${APACHE_LOG_DIR}/www.arkivarium.no-access.log combined
      </VirtualHost>

      <VirtualHost arkivarium.no:80>
        ServerName arkivarium.no
        ServerAdmin webmaster@arkivarium.no
        DocumentRoot /var/www/html/nikita-noark5-core/web/
        ErrorLog ${APACHE_LOG_DIR}/arkivarium.no-error.log
        CustomLog ${APACHE_LOG_DIR}/arkivarium.no-access.log combined
        RewriteEngine on
        RewriteCond %{SERVER_NAME} =arkivarium.no
        RewriteRule ^ https://%{SERVER_NAME}%{REQUEST_URI} [END,NE,R=permanent]
      </VirtualHost>
   
    Replace {www.}arkivarium.no with your own domain name and configure DNS settings.
    Add DNS A records on the name servers for your domain, pointing to the IP address of your web server.
      arkivarium.no A 178.255.144.179
      www.arkivarium.no A 178.255.144.179
   
    Remember to replace the domain arkivarium.no and IP address 178.255.144.179 with the actual domain and IP address of your web server running Apache.
    Enable HTTPD configuration in /etc/apache2/sites-enabled/000-default.conf
    Download certbot-auto from https://certbot.eff.org/ and run certbot --apache -d arkivarium.no (replace arkivarium.no with your domain).  Install the certificates for arkivarium.no and add a redirect to https in certbot.
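    Install the Apache 2 HTTP daemon and enable the modules ssl, proxy, proxy_http and headers (mod_headers provides the RequestHeader directive used in the proxy configuration):
      apt-get install apache2-bin
      a2enmod ssl
      a2enmod proxy
      a2enmod proxy_http
      a2enmod headers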
   
    Configure the Apache 2 HTTPD proxy and proxy_http module in /etc/apache2/sites-available/000-noark5v4.conf to access http://localhost:8092/noark5v4/ on http://arkivarium.no/noark5v4/ (replace the domain arkivarium.no with your own domain):
      <IfModule mod_proxy.c>
        <Location /noark5v4>
          ProxyPass http://localhost:8092/noark5v4/
          ProxyPassReverse http://localhost:8092/noark5v4/
          RequestHeader set X-Forwarded-Proto "https"
          ProxyPreserveHost On
        </Location>
      </IfModule>
   
    Add a symbolic link from /etc/apache2/sites-available/000-noark5v4.conf to /etc/apache2/sites-enabled/000-noark5v4.conf
      cd /etc/apache2/sites-enabled/
      ln -s /etc/apache2/sites-available/000-noark5v4.conf
   
    Restart Apache so that the new configuration takes effect: service apache2 restart

Good luck. I don't think nginx works as a frontend to Nikita. The setup above is the one that ran and worked on nikita.oslomet.no.

Unfortunately I did not have time to set up a local instance of Nikita at NTNU, but I will be back at OsloMet in the autumn and may be able to help with
recently updated knowledge of cybersecurity and computer networking with OpenStack, and of web technologies such as HTML, CSS3, Rust
and NodeJS/NPM.

If you get nginx working as a web frontend to Nikita, please write the setup down, and I can add it to https://github.com/arkivarium/book

Regards,
Ole

On Tuesday, 25 February 2025 at 15:26:22 +01:00, Petter Reinholdtsen <pere@hungry.com> wrote:

It looks like the interaction between nginx and Nikita can end up in an
invalid database state. I just tested import-email on the entire test
mbox, and it failed. The error does not occur when nginx is not in
front of Nikita. Here is the import error I see:

% ./import-email --saksmappe --baseurl https://nikita.example.com/noark5v5/ --client_id nikita_client --username user --password prompt --mbox testdata/nikita-noark.mbox --maxinserts 100000
[...]
1.43 s per insert/email (249 inserted, 249 seen)
Looking for msgid: <VI1PR04MB132731D552E24DB0C554144092F80@VI1PR04MB1327.eurprd04.prod.outlook.com>
MappeID: dict_keys([])
format ignore for multipart/alternative not known by API, asking API to pick format code
format None for text/plain not known by API, asking API to pick format code
format None for text/html not known by API, asking API to pick format code
Identifiers: []
1.83 s per insert/email (250 inserted, 250 seen)
Looking for msgid: <2fly3tpumnw.fsf@diskless.uio.no>
Identifiers: []
0.92 s per insert/email (251 inserted, 251 seen)
Looking for msgid: <2flvaotulrs.fsf@diskless.uio.no>
error: upload threw exception, removing object: HTTP Error 413: Request Entity Too Large
Traceback (most recent call last):
File "/home/user/noark5-tester/./import-email", line 474, in createDocumentObject
(c, res) = self.post(str(newfilehref), content,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/noark5-tester/lib/n5core/endpoint.py", line 216, in post
return self._post(path, data, mimetype, length, accept, headers);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/noark5-tester/lib/n5core/endpoint.py", line 235, in _post
response = urlopen(request)
^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/urllib/request.py", line 216, in urlopen
return opener.open(url, data, timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/urllib/request.py", line 525, in open
response = meth(req, response)
^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/urllib/request.py", line 634, in http_response
response = self.parent.error(
^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/urllib/request.py", line 563, in error
return self._call_chain(*args)
^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/urllib/request.py", line 496, in _call_chain
result = func(*args)
^^^^^^^^^^^
File "/usr/lib/python3.11/urllib/request.py", line 643, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 413: Request Entity Too Large

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/user/noark5-tester/./import-email", line 871, in <module>
exit(main())
^^^^^^
File "/home/user/noark5-tester/./import-email", line 868, in main
return t.load() != True
^^^^^^^^
File "/home/user/noark5-tester/./import-email", line 856, in load
return self.loadMbox()
^^^^^^^^^^^^^^^
File "/home/user/noark5-tester/./import-email", line 829, in loadMbox
if not self.uploadEmail(message):
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/noark5-tester/./import-email", line 762, in uploadEmail
return self.addMessageToFile(threadfile, message)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/noark5-tester/./import-email", line 678, in addMessageToFile
if not self.createDocumentObject(docdescinfo, docobjdata,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/noark5-tester/./import-email", line 480, in createDocumentObject
self.delete(o)
File "/home/user/noark5-tester/lib/n5core/endpoint.py", line 341, in delete
response = opener.open(request)
^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/urllib/request.py", line 525, in open
response = meth(req, response)
^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/urllib/request.py", line 634, in http_response
response = self.parent.error(
^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/urllib/request.py", line 563, in error
return self._call_chain(*args)
^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/urllib/request.py", line 496, in _call_chain
result = func(*args)
^^^^^^^^^^^
File "/usr/lib/python3.11/urllib/request.py", line 643, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 400:
%

The Nikita log shows the following:

Feb 25 15:00:42 nikitatest.uio.no java[1390091]:
2025-02-25T15:00:42.363+01:00 ERROR 1390091 --- [Nikita Noark 5 Core
(using Postgres)] [.1-8092-exec-10] a.w.s.e.GlobalETAGExceptionHandler
: Cannot invoke "String.length()" because "checksum" is null

This in turn causes export-all to fail once a document object is
returned without a checksum:

% ./export-all --xml --baseurl https://nikita.example.com/noark5v5/ --client_id nikita_client --username user --password prompt
[...]
Traceback (most recent call last):
File "/home/user/noark5-tester/./export-all", line 1657, in <module>
exit(main())
^^^^^^
File "/home/user/noark5-tester/./export-all", line 1652, in main
return t.recurse_xml()
^^^^^^^^^^^^^^^
File "/home/user/noark5-tester/./export-all", line 1434, in recurse_xml
self.recurse_xml_save_journal(
File "/home/user/noark5-tester/./export-all", line 1385, in recurse_xml_save_journal
callback()
File "/home/user/noark5-tester/./export-all", line 1435, in <lambda>
lambda : self.recurse_xml_save_endringslogg(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/noark5-tester/./export-all", line 1348, in recurse_xml_save_endringslogg
callback()
File "/home/user/noark5-tester/./export-all", line 1436, in <lambda>
lambda : self.recurse_xml_arkiv(output, arkiv,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/user/noark5-tester/./export-all", line 1181, in recurse_xml_arkiv
self.recurse_xml_out_ordered(output, 'arkiv', content, sequence, attr=attr, parents=parents)
File "/home/user/noark5-tester/./export-all", line 307, in recurse_xml_out_ordered
k[1](output, sub, parents=[origcontent,]+parents)
File "/home/user/noark5-tester/./export-all", line 1117, in recurse_xml_arkivdel
self.recurse_xml_out_ordered(output, 'arkivdel', content, sequence, parents=parents)
File "/home/user/noark5-tester/./export-all", line 307, in recurse_xml_out_ordered
k[1](output, sub, parents=[origcontent,]+parents)
File "/home/user/noark5-tester/./export-all", line 979, in recurse_xml_mappe
self.recurse_xml_out_ordered(output, 'mappe', content, sequence, attr=attr, parents=parents)
File "/home/user/noark5-tester/./export-all", line 307, in recurse_xml_out_ordered
k[1](output, sub, parents=[origcontent,]+parents)
File "/home/user/noark5-tester/./export-all", line 887, in recurse_xml_registrering
self.recurse_xml_out_ordered(output, 'registrering', content, sequence, attr=attr, parents=parents)
File "/home/user/noark5-tester/./export-all", line 307, in recurse_xml_out_ordered
k[1](output, sub, parents=[origcontent,]+parents)
File "/home/user/noark5-tester/./export-all", line 616, in recurse_xml_dokumentbeskrivelse
self.recurse_xml_out_ordered(output, 'dokumentbeskrivelse', content, sequence, parents=parents)
File "/home/user/noark5-tester/./export-all", line 307, in recurse_xml_out_ordered
k[1](output, sub, parents=[origcontent,]+parents)
File "/home/user/noark5-tester/./export-all", line 506, in recurse_xml_dokumentobjekt
filename = '%s-%s.%s' % (content['sjekksumAlgoritme'], content['sjekksum'], suffix)
~~~~~~~^^^^^^^^^^^^
KeyError: 'sjekksum'
%
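
The KeyError above comes from indexing content['sjekksum'] directly. A defensive variant of the filename construction could look like the sketch below. Note that document_filename is a hypothetical helper, not a function in export-all, and returning None so the caller can skip or report the object is only one possible way to handle it.

```python
def document_filename(content, suffix):
    """Build '<algorithm>-<checksum>.<suffix>', or return None if either field is missing.

    content is the dict describing a dokumentobjekt as returned by the API.
    """
    algorithm = content.get('sjekksumAlgoritme')
    checksum = content.get('sjekksum')
    if not algorithm or not checksum:
        # A document object without checksum indicates an incomplete upload;
        # let the caller decide whether to skip it or abort the export.
        return None
    return '%s-%s.%s' % (algorithm, checksum, suffix)
```

With this, an export can log the offending document object and continue instead of aborting with a traceback.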

I assume the trigger for this error is a low upload limit in
nginx, which can be fixed with "client_max_body_size 20M;" or similar,
but Nikita should in any case not end up with an invalid
document object instance when this happens.
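
On the client side, the import traceback shows the cleanup DELETE failing with HTTP 400 while the original HTTP 413 was being handled, so the second error hides the first. A minimal sketch of reporting both follows. The post and delete arguments are hypothetical callables standing in for the client's HTTP helpers, and upload_with_cleanup is not the actual import-email code.

```python
from urllib.error import HTTPError


def upload_with_cleanup(post, delete, href, content):
    """Upload content to href; on failure, try to remove the half-created object.

    post(href, content) and delete(href) are assumed to raise HTTPError
    on failure. Returns a tuple (ok, message).
    """
    try:
        post(href, content)
    except HTTPError as upload_error:
        summary = "upload failed with HTTP %d" % upload_error.code
        try:
            delete(href)
        except HTTPError as cleanup_error:
            # Report the cleanup failure without hiding the original error.
            return False, "%s; cleanup failed with HTTP %d" % (
                summary, cleanup_error.code)
        return False, "%s; object removed" % summary
    return True, "uploaded"
```

This does not repair the server-side state, but it keeps the 413 visible in the output so the upload limit is easier to spot.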

--
Best regards
Petter Reinholdtsen
_______________________________________________
nikita-noark mailing list -- nikita-noark@nuug.no
To unsubscribe send an email to nikita-noark-leave@nuug.no