Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP Add Vision capabilities via gpt 4o #123

Draft
wants to merge 12 commits into
base: master
Choose a base branch
from

Conversation

tillydray
Copy link
Contributor

To test, do M-x org-ai-query-image-and-display and follow prompts.

related to #122

just enough to prove that it's working. To test, do `M-x
org-ai-query-image-and-display` and follow prompts. Happy path only for
now

related to rksm#122
**Why:**
"valid-ish" b/c there are still plenty of ways the user can enter
invalid input but these changes cover most likely issues

rksm#122
@rksm
Copy link
Owner

rksm commented Aug 7, 2024

Nice work!

What do you think about using

  (interactive (list
                (read-file-name "Image file path: " nil nil nil (or (thing-at-point 'existing-filename)
                                                                    (thing-at-point 'url)))
                (read-string "Question: " nil 'minibuffer-history)))

when querying the user? This would prefill the input when having the cursor over a filename or link.

@tillydray
Copy link
Contributor Author

tillydray commented Aug 10, 2024 via email

@tillydray tillydray force-pushed the 122/add-vision-capabilities-to-understand-images branch from b147beb to a6319fd Compare October 30, 2024 16:48
**Why:**
I just copy-pasted the code in a6319fd from a pr comment without
thinking about what it did

rksm#122
@tillydray tillydray force-pushed the 122/add-vision-capabilities-to-understand-images branch from 665d6aa to a9149c5 Compare October 30, 2024 20:33
Copy link
Contributor Author

@tillydray tillydray left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self review

org-ai-openai-image-query.el Outdated Show resolved Hide resolved
org-ai-openai-image-query.el Show resolved Hide resolved
(while (string-empty-p input)
(setq input
(read-string (if default
(format "Enter image URL or file path (default %s): " default)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider adding "URL must start with http?s://

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

feels bulky to add more text, and people probably already know that urls must start with http[s]:// in emacs right?

(const :tag "gpt-4o" "gpt-4o")
(const :tag "gpt-4-turbo" "gpt-4-turbo")))

(defcustom org-ai-query-image-file nil
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rksm you probably recognize this from org-ai-on-region-file, just calling attention in case you want me to do this in a different way

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants