The Best Of

Darwin
Feb 16, 2024, 9:54 AM
Forwarded from another channel:
Is anybody working on embedding crawl data as vectors?
Forwarded thread from another channel:
Kristin Tynski
Feb 16, 2024, 10:12 AM
What are you trying to do?
Darwin
Feb 16, 2024, 10:44 AM
Crawl data analysis through chat, a la code interpreter.
Kristin Tynski
Feb 16, 2024, 1:42 PM
have you tried openinterpreter?
Kristin Tynski
Feb 16, 2024, 1:42 PM
langchain has a million different embedding schemes
Kristin Tynski
Feb 16, 2024, 1:44 PM
llamaindex also has a very easy RAG pipeline
Unlock the power of RAG pipeline with LLama Index. Streamline document extraction, embeddings, and domain-specific QA apps efficiently.
Written by Sunil Kumar Dash. Est. reading time: 12 minutes
Analytics Vidhya: Build a RAG Pipeline With the LLama Index
Darwin
Feb 16, 2024, 2:20 PM
Thank you, Kristin! I've tried openinterpreter and Langchain; however, I was hoping some of the crawling tools providers were already working on this. If nobody is, I guess I'll have to start looking to build something around it.
Elias
Feb 16, 2024, 3:54 PM
Open source crawling tool provider here. Sounds interesting.
Let me know if I can help.
Darwin
Feb 16, 2024, 5:59 PM
Absolutely! Sending a DM, thank you!
Kristin Tynski
Feb 17, 2024, 12:26 PM
yeah, I've been looking for an LLM-powered web crawler/smart scraper too, but nothing great so far. This is probably the closest
We are a technology and design group exploring novel ways to use AI to expand what people are capable of.
Kristin Tynski
Feb 17, 2024, 12:26 PM
which has an API that I haven't tried yet
Kristin Tynski
Feb 17, 2024, 12:28 PM
When it comes to web-enabled agents though, nothing really has better than about a 25% success rate on tasks yet. There are companies working on models trained specifically to do this sort of thing, though. Benchmarks for it came out just recently.
GitHub: web-arena-x/visualwebarena: VisualWebArena is a benchmark for multimodal agents.
Kristin Tynski
Feb 17, 2024, 12:42 PM
This is an interesting direction IMO
Darwin
Feb 17, 2024, 1:16 PM
This by @pletzer, @jordan.choo, @dale and you makes me think some might have started down this path. Embedding keywords + crawl data + GSC data would unlock and facilitate lots of SEO tasks.
Forwarded from another channel
I’m curious: Anyone using embeddings? What’s your workflow? I’m still toying around with mapping the data but I am looking for “real” use-cases
Dale McGeorge
Nice one @jordan.choo, Have you tried any chunking methods? It might also help refine where to put the link within the content.
JC
@dale I have not and didn’t even think about it. I’ll have to give that a whirl
Kristin Tynski
Yes, doing a lot with embeddings for content creation pipelines I'm working on. Useful in many ways. I'm using it for my RAG/retrieval, and doing some fun stuff with Neo4j, integrating graph traversal with nearest-neighbor embedding lookup on a knowledge graph
Darwin
Feb 17, 2024, 1:25 PM
@jordan.choo's article is a perfect example of such a use case
Discover how to build internal links at scale using AI and vector search in this step by step tutorial. Read on to find out more
Written by Jordan Choo. Time to read: 6 minutes
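A rough sketch of the vector-search idea behind internal link suggestions, not the method from the linked article. It assumes each crawled page already has an embedding; the URLs, the toy 3-dimensional vectors, and the `link_candidates` helper with its `min_score` threshold are all made up for illustration.

```python
import math

def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)

def link_candidates(page_vecs, top_n=2, min_score=0.5):
    """For each page, list the most similar other pages as link candidates.

    page_vecs maps URL -> embedding. min_score is a hypothetical cutoff;
    a sensible value depends on the embedding model in use.
    """
    unit = {url: normalize(vec) for url, vec in page_vecs.items()}
    candidates = {}
    for url, vec in unit.items():
        scored = []
        for other, other_vec in unit.items():
            if other == url:
                continue  # a page should not link to itself
            score = sum(a * b for a, b in zip(vec, other_vec))
            if score >= min_score:
                scored.append((score, other))
        scored.sort(reverse=True)
        candidates[url] = [other for _, other in scored[:top_n]]
    return candidates

# Toy embeddings standing in for real model output
page_vecs = {
    "/guides/internal-linking": [0.9, 0.2, 0.1],
    "/guides/anchor-text": [0.8, 0.3, 0.1],
    "/guides/site-speed": [0.1, 0.1, 0.9],
}
suggestions = link_candidates(page_vecs)
```

Embedding chunks instead of whole pages, as Dale suggests in the forwarded thread, would let the same lookup point at a specific passage rather than a whole URL.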
Kristin Tynski
Feb 17, 2024, 1:27 PM
I've tried this:
In this blog, you will learn how to use the neo4j-advanced-rag template for Retrieval Augmented Generation and host it using LangServe.
Graph Database & Analytics: Implementing Advanced Retrieval RAG Strategies With Neo4j
Kristin Tynski
Feb 17, 2024, 1:28 PM
Can use langchain to get a sitemap, then load the sitemap into a graph database as nodes/edges that are generated using GPT-4
Extends from the WebBaseLoader, SitemapLoader loads a sitemap from a
Sitemap | LangChain
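A minimal sketch of the sitemap-to-graph step using only the Python standard library rather than LangChain's loader. One loud assumption: edges here come from URL path hierarchy, which merely stands in for the GPT-4 generated relationships described above. The `path_graph` helper and the example.com sitemap are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace defined by the sitemaps.org protocol
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def sitemap_urls(xml_text):
    """Return every <loc> URL found in a sitemap XML string."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall(".//sm:loc", SITEMAP_NS) if loc.text]

def path_graph(urls):
    """Build nodes and parent -> child edges from URL paths.

    Assumption: path hierarchy is a stand-in for LLM-generated edges.
    """
    nodes = {urlparse(u).path.rstrip("/") or "/" for u in urls}
    edges = set()
    for path in nodes:
        if path == "/":
            continue
        parent = path.rsplit("/", 1)[0] or "/"
        # Only connect to parents that are themselves pages in the sitemap
        if parent in nodes:
            edges.add((parent, path))
    return sorted(nodes), sorted(edges)

EXAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/blog</loc></url>
  <url><loc>https://example.com/blog/internal-links</loc></url>
</urlset>"""

nodes, edges = path_graph(sitemap_urls(EXAMPLE))
```

The resulting node and edge lists could then be written to whatever graph database is in use; that loading step is left out here because it depends on the database driver.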
Kristin Tynski
Feb 17, 2024, 1:30 PM
Retrieval on it can be done in a really interesting way: you do kNN to find the parent node closest to your query, then return that node plus its attached nodes and their relationships. Or you can do more sophisticated graph traversals, using things like PageRank, to return more nodes/relationships
Kristin Tynski
Feb 17, 2024, 1:31 PM
it also auto-generates questions that are answered by the parent nodes and attaches them as child nodes with edge relationships
Kristin Tynski
Feb 17, 2024, 1:31 PM
which can provide a lot of additional SEO value, especially applied as a step in a content writing pipeline
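The retrieval described in the last few messages might look roughly like this in plain Python. This is a sketch under assumptions, not the neo4j-advanced-rag template: the node names, the `ANSWERS` and `LINKS_TO` relationship labels, the toy vectors, and the `retrieve` helper are all invented for illustration, and a real setup would run the nearest-neighbor lookup inside the graph database.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, parent_vecs, edges, k=1):
    """Return the k parent nodes nearest the query, each with its attached nodes.

    parent_vecs maps node id -> embedding; edges is a list of
    (source, relationship, target) triples.
    """
    ranked = sorted(
        parent_vecs,
        key=lambda node: cosine(query_vec, parent_vecs[node]),
        reverse=True,
    )
    results = []
    for node in ranked[:k]:
        # One-hop expansion: everything directly attached to the parent
        attached = [(rel, dst) for src, rel, dst in edges if src == node]
        results.append({"node": node, "attached": attached})
    return results

# Toy graph: two parent pages, with generated questions as child nodes
parent_vecs = {
    "/blog/internal-links": [0.9, 0.1, 0.0],
    "/blog/crawl-budget": [0.1, 0.9, 0.0],
}
edges = [
    ("/blog/internal-links", "ANSWERS", "How do I find internal link targets?"),
    ("/blog/internal-links", "LINKS_TO", "/blog/crawl-budget"),
    ("/blog/crawl-budget", "ANSWERS", "What is crawl budget?"),
]

hits = retrieve([1.0, 0.0, 0.0], parent_vecs, edges, k=1)
```

Swapping the one-hop expansion for a ranked traversal (PageRank or similar) changes only the inner loop; the kNN entry point stays the same.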
Darwin
Feb 17, 2024, 1:41 PM
Also wondering if @sugan or @derek are testing things in this space
Derek Perkins
Feb 17, 2024, 2:07 PM
Yeah, pretty deep into it, and now that BigQuery supports vector search, that's the final piece for us
Forwarded from another channel
this just launched today, and I'm pumped
Dale McGeorge
End of the month
Derek Perkins
Is there a link or sign-up yet?
Dale McGeorge
Derek Perkins
Feb 17, 2024, 2:11 PM
The hardest thing is committing to a chunking strategy and then an embedding model, because it's going to be cost-prohibitive to re-embed, and the space is moving so fast
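To make the re-embedding cost concrete, here is a small sketch of overlapping word-window chunking plus a cache keyed on model name and chunk hash. Everything in it is an assumption for illustration: the window sizes, the `embed_fn` callable (a placeholder for whatever embedding API is used), and the dict used as a cache.

```python
import hashlib

def chunk_words(text, size=200, overlap=40):
    """Split text into overlapping windows of `size` words."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks

def cache_key(model_name, chunk):
    """Key on both model and content: a new model invalidates every entry."""
    digest = hashlib.sha256(chunk.encode("utf-8")).hexdigest()
    return f"{model_name}:{digest}"

def embed_with_cache(chunks, model_name, embed_fn, cache):
    """Embed only chunks missing from the cache; return (vectors, newly_embedded)."""
    newly_embedded = 0
    vectors = []
    for chunk in chunks:
        key = cache_key(model_name, chunk)
        if key not in cache:
            cache[key] = embed_fn(chunk)  # the expensive call in practice
            newly_embedded += 1
        vectors.append(cache[key])
    return vectors, newly_embedded
```

A cache like this spares unchanged pages on a re-crawl, but changing either the chunking parameters or the model name produces all-new keys, which is exactly the commitment problem described above.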
Derek Perkins
Feb 17, 2024, 2:24 PM
Another issue from a tool provider perspective is having to rely on general models. If an in-house team is crawling their own site, they can pick an ecommerce-tailored model, whereas that's not really feasible for a tool provider, especially if you want to compare across sites
Robin Allenson
Mar 21, 2024, 1:11 AM
Maybe it’s worth looking at the Fuyu family of multimodal models from Adept. They are large action models trained to make sense of, and take action on, web page data.
We’re open-sourcing Fuyu-8B - a small version of the multimodal model that powers our product.
Fuyu-8B: A Multimodal Architecture for AI Agents