paperless-ngx
I spent most of today setting up paperless-ngx, a self-hosted electronic document archive.
I started from a docker-compose.yml generated by ChatGPT.
This "artificial idiot" made several mistakes in the file, and it took me more than an hour to track down the errors and get everything running normally.
One of the errors came from the multi-language OCR setting.
I am also not very satisfied with the OCR accuracy, so I am looking for ways to improve it or for an alternative.
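
For reference, paperless-ngx reads its OCR languages from two environment variables that are spelled slightly differently, which is an easy place for a generated compose file to go wrong. The sketch below builds both values from a list of Tesseract codes. The example languages (chi_sim and eng), the helper name, and the list of languages bundled with the Docker image are assumptions for illustration; the post does not say which languages or which exact error were involved.

```python
# Languages assumed to ship with the official Docker image already (an assumption;
# check the documentation of the image version you run).
BUNDLED = {"eng", "deu", "fra", "ita", "spa"}


def ocr_language_env(codes):
    """Build the two paperless-ngx OCR settings from Tesseract language codes."""
    if not codes:
        raise ValueError("at least one language code is required")
    extra = [c.replace("_", "-") for c in codes if c not in BUNDLED]
    return {
        # Languages used during recognition, joined with "+", e.g. "chi_sim+eng".
        "PAPERLESS_OCR_LANGUAGE": "+".join(codes),
        # Extra language packs to install, separated by spaces.
        # Package names use "-" where Tesseract codes use "_", e.g. "chi-sim".
        "PAPERLESS_OCR_LANGUAGES": " ".join(extra),
    }


if __name__ == "__main__":
    for key, value in ocr_language_env(["chi_sim", "eng"]).items():
        print(f"{key}: {value}")
```

The printed lines can be copied into the environment section of the webserver service in docker-compose.yml.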

OCR
My original plan was to modify the paperless-ngx source code so that it would call another OCR engine through an API and fill in the text content.
But that is too complicated to program, and right now I am too lazy to spend time solving the problem at the source-code level.
After some searching, I found that PaddleOCR meets my needs and has high recognition accuracy. I also found Umi-OCR, a Windows exe client for it that can be used directly.
I had planned to deploy PaddleOCR on my Orange Pi ARM server, but using the desktop client directly is easier for now.
It means I have to run the recognition by hand and copy the recognized text into paperless-ngx, but as long as the text is accurate, one extra manual step does not matter.
Project address:
https://github.com/hiroi-sora/Umi-OCR/releases
The important thing is that it is free, open source, and works offline.
It is a very good option for people who cannot program or who have low-spec machines.
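
The manual copy step could later be scripted. Below is a minimal sketch, assuming the paperless-ngx REST API accepts a PATCH on /api/documents/<id>/ with a "content" field and token authentication. The server address, token, and document id are placeholders, and the function name is hypothetical. The request is only built here, not sent.

```python
import json
import urllib.request


def build_content_update(base_url, token, document_id, text):
    """Prepare a PATCH request that replaces a document's stored text content."""
    url = f"{base_url.rstrip('/')}/api/documents/{document_id}/"
    body = json.dumps({"content": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Placeholder values; replace them with a real server, token, and document id.
    req = build_content_update(
        "http://localhost:8000", "YOUR_API_TOKEN", 1, "text pasted from Umi-OCR"
    )
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```

Building the request separately from sending it makes it easy to check the URL and payload before touching real documents.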

Blog message notification
I self-host ntfy to send push notifications to my Android phone.
It is also free and open source, and it can be self-hosted.
It does not rely on Google services (GMS-free).
Notifications can be triggered through a simple HTTP API, which makes it a good fit for a JavaScript fetch call in a blog page.
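
Publishing to ntfy is a plain HTTP POST to the topic URL with the message as the request body, so a browser fetch with method POST does the same thing. Here is a minimal sketch of that request in Python. The server address and topic name are placeholders, not the ones used on this blog.

```python
import urllib.request


def build_ntfy_request(server, topic, message, title=None):
    """Prepare the POST request that publishes one message to an ntfy topic."""
    url = f"{server.rstrip('/')}/{topic}"
    headers = {}
    if title:
        # Optional notification title; keep it ASCII because it travels as an HTTP header.
        headers["Title"] = title
    return urllib.request.Request(
        url, data=message.encode("utf-8"), method="POST", headers=headers
    )


if __name__ == "__main__":
    # Placeholder server and topic; nothing is sent unless urlopen is called.
    req = build_ntfy_request(
        "https://ntfy.example.com", "blog-visits", "New visitor on /about", title="Blog"
    )
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```

Anyone who knows the topic name can publish to it, so on a public blog page the topic should be treated as visible to visitors.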

I have set up these notifications on my blog, so my phone gets a notification whenever a real visitor opens a page.
One day has passed, and apart from my own visit in Incognito mode I have not received any visitor notifications.
Surprisingly, because my code is incomplete, I did receive a notification triggered by a Google crawler.
The request was sent by Integral Ad Science (IAS), which crawls pages for advertising content analysis, brand safety detection, or page content collection.
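
One way to close that gap is to check the User-Agent before sending a notification. The sketch below is a simple substring filter, not a reliable bot detector. The marker list is illustrative, and the exact User-Agent of the IAS crawler is not given in this post, so the marker for it has to come from your own logs through extra_markers.

```python
# Generic substrings that commonly appear in crawler User-Agent strings (illustrative only).
BOT_MARKERS = ("bot", "crawl", "spider", "headless")


def is_probable_bot(user_agent, extra_markers=()):
    """Return True when the User-Agent is empty or contains a known crawler marker."""
    if not user_agent:
        # Real browsers always send a User-Agent, so a missing one is treated as a bot.
        return True
    ua = user_agent.lower()
    markers = BOT_MARKERS + tuple(m.lower() for m in extra_markers)
    return any(marker in ua for marker in markers)


if __name__ == "__main__":
    googlebot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    print(is_probable_bot(googlebot))
```

Crawlers that imitate a normal browser User-Agent will still get through, so this only reduces the noise.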

HomeLab Summary
Here is what runs on my homelab server, an Orange Pi 3B (8 GB RAM, 512 GB storage).
I use showdoc (documentation) every day: the draft of every blog post lives there, along with my other notes.
I use vaultwarden (password management) all the time, together with the Bitwarden Password Manager extension for Google Chrome.
Recently I have also been using the xmind mind-map plugin and the onlyoffice spreadsheet plugin in nextcloud a lot (onlyoffice itself is self-hosted too).
My blog group currently has 5 blogs running on wordpress, with umami for traffic statistics. nomad is the main blog and the others are on hold.
paperless-ngx, deployed just today, serves as my electronic archive. I have tested it with receipts and it works. Later I plan to file documents from insurance, banks, restaurants, convenience stores, and so on.
In total there are 7 kinds of open source applications on the server (wordpress alone runs as 5 instances).
That is, 7 application types and 10 application instances

The above are the applications that hold my day-to-day data.
In addition, aapanel serves as the server control panel.
FRP serves as the proxy to the public network, with PM2 keeping it running as a background process.
A vultr cloud server provides the public IP, so that traffic can be forwarded to the local machine.
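
To make the FRP part more concrete, here is a small sketch that renders a client config in the TOML format used by recent frp releases. The public IP, ports, and proxy name are made-up placeholders, not my real setup, and the helper name is hypothetical.

```python
def render_frpc_toml(server_addr, server_port, proxies):
    """Render a minimal frpc.toml that maps local TCP ports to ports on the public server."""
    lines = [
        f'serverAddr = "{server_addr}"',  # public IP of the cloud server running frps
        f"serverPort = {server_port}",    # port that frps listens on
    ]
    for proxy in proxies:
        name = proxy["name"]
        local_port = proxy["local_port"]
        remote_port = proxy["remote_port"]
        lines += [
            "",
            "[[proxies]]",
            f'name = "{name}"',
            'type = "tcp"',
            'localIP = "127.0.0.1"',
            f"localPort = {local_port}",    # service port on the Orange Pi
            f"remotePort = {remote_port}",  # port exposed on the public server
        ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # 203.0.113.10 is a documentation-only address used as a placeholder.
    print(render_frpc_toml(
        "203.0.113.10", 7000,
        [{"name": "paperless", "local_port": 8000, "remote_port": 18000}],
    ))
```

The output can be saved as frpc.toml on the Orange Pi, and the frpc process can then be kept alive by PM2.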
In other words, a single Orange Pi development board costing about $60 is enough to deploy all of the services above.
Not only is it free to use, the data also stays in your own hands. Judging by the system status on the dashboard, there is still room for a few more lightweight applications. If I shut down the 4 extra wordpress instances, I could add other open source applications, for example 3 more types.
I have also deployed minio, memos, wzi, planka, drawio, and a few other services, but they are not running. I turn them on when I need them.
I also have other Orange Pi and Raspberry Pi boards and a small x86 machine of unknown brand, with some other services deployed on them.
But so far none of this makes any money; it is only for my own use. If you are not comfortable with the technology, I recommend not bothering, because it takes a lot of time.