arkohut 5 days ago

Memos is a privacy-focused passive recording project. It can automatically record screen content, build intelligent indices, and provide a convenient web interface to retrieve historical records.

This project draws heavily from two other projects: one called Rewind and another called Windows Recall. However, unlike both of them, Memos allows you to have complete control over your data, avoiding the transfer of data to untrusted data centers.

  • walterbell 5 days ago

    > avoiding the transfer of data to untrusted data centers

    In short order, this will create a large corpus of unsecured local data.

    Is the user expected to secure the data independently?

    Do Recall/Rewind help the user to filter recorded data for retention or deletion?

    • demarq 4 days ago

      This is odd. Why would you secure this piece of data and leave everything else open?

      Surely you encrypt your disk rather than trying to secure this one app? I mean, there’s far more valuable stuff on your machine than anything this app could possibly store.

      • yencabulator 4 days ago

        Things that are fleetingly on-screen are not commonly stored on disk forever. That changes with these apps.

    • arkohut 5 days ago

      Rewind and Recall also store similar data locally, though perhaps not only locally. Both allow data deletion, and both can retain only the most recent data based on time.

      • patrickhogan1 5 days ago

        Rewind and Recall are 2 separate projects and 2 separate installers. I use Rewind and I have several outbound network monitoring apps as well as local disk monitoring apps. Rewind does not send data offsite.

        Rewind does glitch sometimes, specifically with audio recording, which is extremely annoying. You go back to an area where you thought you had audio notes only to find you didn’t, even though you had audio recording turned on the whole time. It has something to do with meeting detection, which is silly because disk space is cheap: just auto-record. I do like the concept of an open source version and I will look into this.

      • walterbell 5 days ago

        Thanks to the PR debacle, Recall now encrypts the data in a VM: https://www.windowscentral.com/software-apps/windows-11/wind...

        • arkohut 5 days ago

          If this is very important, I suppose I will implement encryption for stored data in future versions.

          However, I still have a question about this: it seems that many hard disks are already encrypted. After all, I also store a large amount of personal photos, documents, bills, and other important information on my computer, and I haven’t meticulously encrypted all this data again. Should I be doing that?

          • pstoll 5 days ago

            It’s a question of risk.

            Full disk encryption targets a different threat model - disk encryption protects against someone grabbing your computer.

            Writing into an encrypted blob on disk adds a layer of protection against bad actors exfiltrating data by running code on the laptop.

            Overall I really am amazed that this sort of thing is now possible and appreciate a privacy-aware / local compute and storage version of it!

    • alwayslikethis 5 days ago

      You can't have it both ways. You can either own your data and secure it yourself or you can entrust it to someone else and hope they don't leak it (they will). A lot of the data is already stored in your computer anyways, such as your browser history.

    • 627467 5 days ago

      "large corpus of unsecured local data": is this much worse than an unencrypted Outlook mailbox (PST or OST)? Or offline files from your Dropbox/GDrive/etc.? Or your browser profile?

      I guess it's worse in the sense that it also records audio, but a large corpus of information is already at risk on an unsecured or compromised device.

      • arkohut 4 days ago

        I don’t record audio because I believe this is already a built-in feature in many meeting software applications.

ghssds 5 days ago

Everything you ever do with your computer while using this is subpoenable. To quote Dick Jones: "He's a cyborg, you idiot! He recorded every word you said. His memory's admissible as evidence!"

  • arkohut 4 days ago

    Herein lies the paradox: I want a tool that helps me record more information, but I don’t want this information to be easily exposed to others or used as evidence against me for things I’ve done. Yet, there are moments when I genuinely need to share this information—to prove what I have or haven’t done. The critical bottom line, however, is that the records must remain untampered. If I could alter them at will, their value and meaning would be entirely lost.
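
    The "records must remain untampered" requirement can be illustrated with a hash chain, where each record's hash also covers the hash of the record before it. This is only a minimal sketch under stated assumptions, not something the project is known to implement; the names `record_hash`, `append`, `verify`, and the record layout are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record (assumption)


def record_hash(prev_hash: str, payload: dict) -> str:
    """Hash a record together with the hash of the record before it."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + body).hexdigest()


def append(chain: list, payload: dict) -> None:
    """Add a record whose hash is linked to the current tail of the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"payload": payload, "hash": record_hash(prev_hash, payload)})


def verify(chain: list) -> bool:
    """Recompute every link; an edited payload breaks the chain from that point on."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["hash"] != record_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True
```

    A chain like this only detects edits if the latest hash is also kept somewhere the editor cannot rewrite; anyone able to rewrite the whole file can recompute every hash.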

    • hiatus 4 days ago

      > but I don’t want this information to be easily exposed to others or used as evidence against me for things I’ve done

      Even if you wrote it in a diary it could still be used as evidence against you. The only untouchable place is your mind.

  • samiq 4 days ago

    I really wonder how much of an issue this sort of thing is. Are people out there actively thinking they are going to get caught? For what? Like Eric Schmidt said once, if you have nothing to fear, you have nothing to hide.

    I’m personally fascinated by these sorts of reactions.

    • hiatus 4 days ago

      Who has nothing to fear? Would you offer unrestricted screen recordings of your computer to the public for their perusal?

arkohut 4 days ago

I gave the project a bad name, "memos": there is already a great open source project named "memos". So I quickly changed the name to "Pensieve". Sorry for the confusion...

RileyJames 5 days ago

I deleted Rewind over the weekend, after I noticed it had eaten 20 GB of storage.

I hadn’t used it since installing it. So that came as a surprise.

I then tried using it, and couldn’t get it to find things I knew were in my history (basic keyword matches).

So I deleted it.

I like the idea of this app. It ticks all the boxes. But I haven’t found any value in this category of app yet.

  • arkohut 5 days ago

    Let me share a scenario I personally encountered. One day, I came across an introduction to a Star Wars animation on a video recommendation website. I was drawn to the fancy cover image, which briefly flashed by in the homepage slides. However, I got busy with other things and only had time to search for it later that evening. Without this project, such information would have been almost impossible to find again because, as part of the website’s recommendation system, the content changes every time you refresh the web page. But this time was different: I was able to retrieve the content by searching for the keyword “Star War” and narrowing the search by time range.

    Of course, I know that such a feature might seem trivial. Some things are simply forgotten and that’s fine. But what if it is a more important clue, like a bug on a website that only triggers under narrow conditions and is hard to reproduce?
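
    The keyword-plus-time-range lookup described above can be sketched roughly as follows. This is an illustration only, assuming OCR text is stored with a timestamp in SQLite; the `captures` table, its columns, and the `search` helper are hypothetical and not taken from the project.

```python
import sqlite3


def search(conn, keyword, start_ts, end_ts):
    """Return (ts, text) rows whose text contains keyword within [start_ts, end_ts]."""
    return conn.execute(
        "SELECT ts, text FROM captures "
        "WHERE text LIKE ? AND ts BETWEEN ? AND ? ORDER BY ts",
        (f"%{keyword}%", start_ts, end_ts),
    ).fetchall()


# Hypothetical sample data: a timestamp plus the text recognized on screen.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE captures (ts INTEGER, text TEXT)")
conn.executemany(
    "INSERT INTO captures VALUES (?, ?)",
    [
        (1000, "Homepage slide: new Star Wars animation"),
        (2000, "Spreadsheet: quarterly budget"),
        (9000, "Forum thread about Star Wars"),
    ],
)

# Only the capture inside the time window matches the keyword.
results = search(conn, "Star War", 500, 3000)
```

    A plain substring match is used here for simplicity; a real index would more likely combine full-text and embedding search.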

mdaniel 5 days ago

Ah, I see the commit that renamed the repo[1] because the title says "Memos" and the URL says "/memos" but the repo was different. I similarly got confused while reading the readme thinking Pensieve was a dependency or something

1: https://github.com/arkohut/pensieve/commit/e81057d5bebcf9cab...

  • arkohut 5 days ago

    Sorry for the confusion. I gave the project a bad name, "memos": there is already a great open source project named "memos". So I quickly changed the name to "Pensieve".

pheatherlite 5 days ago

Great work, OP. As others have said, encryption is vital to such a project. In fact, if your ethos is privacy, it would be great marketing material to assure users that this is in fact resistant to basic infiltration. I think recall is a fantastic idea, even for professionals and corporate environments. But the kind of sensitive information that is handled by employees cannot risk being leaked from such a tool.

  • arkohut 5 days ago

    Thanks for the advice. I will work on this feature.

griomnib 5 days ago

I once saw a news segment about people with photographic memories. They were miserable; they wanted to forget, but the worst days of their lives, as well as the best, were as though they had just happened.

I came away quite glad I don’t suffer from a photographic memory, and while I applaud the project, I prefer to forget things.

  • arkohut 4 days ago

    It is just a tool... Delete the records if you want to forget. And although I don’t have a photographic memory, there are still some things I can never forget, whether they’re good or bad.

rtolsma 5 days ago

Strongly recommend you check out the built-in Swift APIs for screen capture and OCR. They’re heavily optimized for energy usage, and allow much finer-grained control over which apps are white/blacklisted for privacy.

  • arkohut 5 days ago

    Thanks for the advice; I will look into this part further. Currently I am using a package named "ocrmac", which helps a lot.

theblazehen 4 days ago

Are there any similar options that work on Linux?

moltar 5 days ago

How’s the performance with Python? What’s the overhead?

  • traverseda 5 days ago

    Have you used much Python, or are you just buying into the "Python slow" memes?

    Unless they've done something very, very wrong, performance will be fine. This isn't doing anything where Python's overhead would matter.

    It's gluing together some highly optimized code written in other languages, or using Python as a DSL to interface with highly optimized libraries like numpy, or generating highly optimized assembly with something like JAX, or, if they're really fancy, compiling a restricted subset directly to GPU shaders or something.

    Python is plenty fast for most stuff, and when it isn't it has one of the best pathways towards optimization.

  • arkohut 4 days ago

    In fact, this project is indeed very compute-intensive, but that’s not Python’s fault. The main reason lies in the use of several machine learning models:

      1. OCR model
      2. Embedding model
      3. VLM model (optional)
    
    I’ve tried many optimization approaches to ensure it doesn’t affect daily usage, though this comes at the cost of reduced search performance.