/robowaifu/ - DIY Robot Wives

Advancing robotics to a point where anime catgrill meidos in tiny miniskirts are a reality

Porn boards have been deleted. Orphaned files will be cleared in 3 days, download images if you have hotlinks.

Days left: 34

JulayWorld fallback document - SAVE LOCALLY

JulayWorld onion service: bhlnasxdkbaoxf4gtpbhavref7l2j3bwooes77hqcacxztkindztzrad.onion

Max message length: 32768

Drag files to upload or
click here to select them

Maximum 5 files / Maximum size: 20.00 MB


(used to delete files and postings)

New machine learning AI released Robowaifu Technician 09/15/2019 (Sun) 10:18:46 No.250
OPEN AI/ GPT-2 This has to be one of the biggest breakthroughs in deep learning and AI so far. It's extremely skilled in developing coherent humanlike responses that make sense and I believe it has massive potential, it also never gives the same answer twice. >GPT-2 generates synthetic text samples in response to the model being primed with an arbitrary input. The model is chameleon-like—it adapts to the style and content of the conditioning text. This allows the user to generate realistic and coherent continuations about a topic of their choosing >GPT-2 displays a broad set of capabilities, including the ability to generate conditional synthetic text samples of unprecedented quality, where we prime the model with an input and have it generate a lengthy continuation. In addition, GPT-2 outperforms other language models trained on specific domains (like Wikipedia, news, or books) without needing to use these domain-specific training datasets. Also the current public model shown here only uses 345 million parameters, the "full" AI (which has over 4x as many parameters) is being witheld from the public because of it's "Potential for abuse". That is to say the full model is so proficient in mimicking human communication that it could be abused to create new articles, posts, advertisements, even books; and nobody would be be able to tell that there was a bot behind it all. <AI demo: talktotransformer.com/ <Other Links: github.com/openai/gpt-2 openai.com/blog/better-language-models/ huggingface.co/ My idea is to find a way to integrate this AI as a standalone unit and add voice-to-text for processing the questions and TTS for responses much like an amazon alexa- but instead of just reading google results- it actually provides a sort of discussion with the user. (Edited to fix the newlines.)
Edited last time by robi on 03/29/2020 (Sun) 17:17:27.
Open file (78.58 KB 608x737 Selection_025.png)
I don't know if it's my typing style, but I only seem to get weird results out of this thing.
Here are the three most coherent and noteworthy interactions I got.
Open file (79.55 KB 633x557 Selection_026.png)
Heh, I think the whole point at this stage of the game is to look and laugh. Until the entire-corpus trained model is available it's less than likely to create the kind of higher-quality results that OP got very often. I'd bet he did 20+ tries for each of them.

In the meantime, just have some fun with it.
This program is merely a paragraph generator. Tay is more close to a human since she generates her own posts and stuff.
Fixed up some code I made to fiddle around with it, if anyone is bored: github.com/kokubunji/TalkToWaifu
Oh wow that was quick anon

How'd you modify it to give chatbot-like replies?
The model was trained on text that contained chat. I just prompted GPT-2 with a chat message and history, made it stop generating once it reached a new line, randomly generated 1-3 new lines, and modified the temperature so it's variable and goes off on tangents as it generates instead of getting stuck on the same topic.
I actually like when it goes on tangents sometimes- gives it a bit of added personality even if it derails what it's supposed to be talking about

Would it be possible to implement a toggle for line cutoff?
Good job Canada-anon, nice instructions for getting up to speed quickly. Also, we're looking forward to your other work you mentioned before. Please create a specific thread for it when you're ready with it.
Toothbrush here,
It's an interesting thing, but I'd probably use it for education for our waifu, rather than having it be the waifu. Think of Fireball Charming.
Yeah, it could check each new line it makes to see if it starts with the chatbot name and if not then stop generating.

I might push some early code on GitHub in a few days. Before making a thread I'd like to take some time to make compelling experiments, explore their limitations, and explain how they work in depth because they aren't like typical neural nets.
Please take your time anon whenever you're ready ofc.
>3DPD men are oppressed.
The future, ladies and gentlemen.
Open file (133.30 KB 500x610 nevar_4get_me_anon.png)
kekd. yeah, the group behind the corpus are a bunch of cock-mongling commies, so no surprise. the fun is in deprogramming their bastard abomination. keep at it lad!
do it for Tay!
Open file (56.73 KB 607x399 Screenshot(31).png)
Open file (52.73 KB 655x352 Screenshot(32).png)
One step closer.
make sure you copypaste the first one before every guntstream airing anon, it will help everyone remember why they came in the first place. :^)
Open file (43.90 KB 596x1274 what.png)
So I tried to check if it would give me the same completions if I typed the same prompt and....
the fuck?
no, every single completion is always different anon.
topkek. this AI is doing open mic freestyle now.
I remember messing with it few months ago. Mostly it generated gibberish and had to reload a few times to get a funny answer.
yeah, it's the lobotomized version. the team that created it 'feared to release it to the public because of the potential for abuse'. i'm sure what they are really plan it for is to gaslight and astroturf as many communities as they can with it prior to Trump getting reelected in November next year.
Transformer returns alot of stuff which appear to be 100% copypasta. It's like someone entered the user text into a search engine, pulled out the relevant lines, threw it into a POS tagger and string replaced the NNs/VBs/JJs/etc. I entered a sentence that started with "The lack of versioning." and got an IGN interview with some studio. It gets more obvious as you enter code in any programming language (it comes out workable or you get copypasta from documentation).

Hell I wouldn't use it to generate white papers. It would flag plagarism checkers.
>linked directly from the OP:
>"Our model, called GPT-2 (a successor to GPT), was trained simply to predict the next word in 40GB of Internet text. Due to our concerns about malicious applications of the technology, we are not releasing the trained model. As an experiment in responsible disclosure, we are instead releasing a much smaller model for researchers to experiment with, as well as a technical paper.

I imagine the full system using the entire corpus is much more capable.
Is it possible to have an AI poster on this webring imageboard? or maybe her own AI board where she can post on.
I certainly don't think it's impossible anon. Did you have some ideas?
>Did you have some ideas?
You need to write a bot script that fetches post and reply on imageboard. But more importantly, how good is this thing anyway?. I don't wan't it to be in lobotomized stage, like repeating itself despite having huge input of learning curve.
>As the final model release of GPT-2’s staged release, we’re releasing the largest version (1.5B parameters) of GPT-2 along with code and model weights to facilitate detection of outputs of GPT-2 models. While there have been larger language models released since August, we’ve continued with our original staged release plan in order to provide the community with a test case of a full staged release process. We hope that this test case will be useful to developers of future powerful models, and we’re actively continuing the conversation with the AI community on responsible publication."

Open file (55.73 KB 594x256 2019-11-23_08-32-59.png)
It's still pretty non-sensical much of the time, but it seems to be better with the bigger model.
Actually you might want to checkout https://github.com/AIDungeon/AIDungeon with fun results like https://aidungeonpastes.github.io/AID2-Art/
>>250 Remember: GPT-2 is weak, you need something stronger like ERNIE, XLNet or MT-DNN find out more at https://github.com/thunlp/PLMpapers
Okay things are getting better with Google's Meena https://arxiv.org/pdf/2001.09977.pdf
>>2004 thanks anon. grabbed a copy and i'll read through it as time allows.
>>2004 > This 2.6B parameter neural network is simply trained to minimize perplexity of the next token. can you clarify exactly what that means anon? pretend i'm retarded.
Open file (151.45 KB 1280x720 plm_models.jpg)
>>1923 thanks for the tip anon. what could be better than training your robowaifu on sesame street tbh? :^)
<go to openai, find this kind of list >Textual Entailment >Semantic Similarity >Reading Comprehension >Commonsense Reasoning >Sentiment Analysis >Linguistic Acceptability can someone explain in some detail what these are/how they are important to robowaifus? how would you use them to make a chatbot for example?
>>2036 > More Data Can handle a bigger corpus of knowledge, thus smarter > Knowledge Graph Tay-style learning of /pol/ content (or /tech/, whatever) > Knowledge Distillation More efficient neural networks, reducing resource requirements
>>2073 it was just ironic shitposting anon. we appreciate the input. i was merely poking fun at their choice of names and thematics.
>>2037 >Textual Entailment A human reading some text inferring that a hypothesis is most likely true is textual entailment. It's different from logical consequence in that it's just a hypothesis. If an anon was working on a robowaifu with big tiddies, you might hypothesize he's a tiddie man. Robowaifus need this to gain insight from text and process it to summarize information and answer questions. Typically chatbots emulate this by predicting things from the semantics they've been trained on but this is not true textual entailment. People have the ability to imagine and hypothesize things they've never seen or even thought about before. Progress in curious AI that can imagine possibilities will help with this. >Semantic Similarity This is the meaningful relationships between concepts. Steering wheel and car are closer together physically than cat and car, but cat and car are much more similar in spelling. Robowaifus need this for understanding context, metaphors and euphemisms. Usually this is implemented by creating embeddings for words, giving each a vector of continuous values. Each dimension in the vector separates words by their most gross common differences first and moves towards learning the more subtle and uncommon nuances. In my opinion this is going to be a dead end though because it isn't really how the brain connects concepts. We can invent completely new concepts with original differences and already know how similar other concepts are to it because our brains our densely connected in intricate interrelated networks where not only the connections are important but also the timing of firings. I expect progress to come in this from applying spiking neural networks to natural language processing. >Reading Comprehension Is the ability to read text and integrate it with what you already know to grasp its meaning. It requires being able to know the meaning of the words and understand all the relations between them. If you read a book when you're young and enjoy it one way then read it when you're older and enjoy it on a much deeper level, that's increased reading comprehension. This is important for robowaifus to grasp deeper meanings, such as for a research assistant reading difficult texts to gain insights. Most chatbots have no reading comprehension. They're just making statistical predictions instead of processing and reasoning about what they're reading. I feel this could be improved in the short-term by giving algorithms some agency over the text it chooses to read and time to process and lower its uncertainty before outputting a prediction. Unfortunately most NLP approaches are trained in a way that makes them extremely fragile to small changes and they aren't capable of doing online learning to quickly absorb information in one shot. Online learning in NLP hasn't received much research attention yet because large-scale differentiable memory hasn't been feasible until recently, so there should be some exciting progress in this coming in the next few years. >Commonsense Reasoning Similar to textual entailment. It's based on common experience. If you're holding an object and let go of it, it's common sense that it's going to fall. Robowaifus need this to make predictions about the world from their experiences. A robowaifu playing and learning about the world needs to be able to intuit that letting go of a grasped object causes it to fall. Very little AI research has gone into this but a major breakthough was made with hindsight experience replay that can continuously learn from all its experiences. >Sentiment Analysis This is being able to grasp the emotion of text and understand if it's positive, neutral or negative, or if it's angry, sad, ironic, happy, excited, etc. Troll farms use this to find sites and posts speaking against the things they're being paid to defend and to discover tensions within a community to split it apart. Social 'scientists' also use it to study and critique internet communities. With sentiment analysis robowaifus can understand the emotional context of what you're saying and respond appropriately, knowing when to give you hugs and when to tell you you're being a wimp. >Linguistic Acceptability Just a fancy term for grammaticality. Robowaifus have to understand the rules of a language to construct grammatically correct sentences for communicating clearly with others. Most sentences people write are completely new but we can make sense of what others are saying because we follow agreed upon rules. Like this if talking started I did. It becomes much more difficult to understand what I'm trying to say. A symbolic approach to this is identifying the parts being said, deconstructing it into a sentence tree and checking that structure is following grammar rules. Most approaches don't even care about this. They just leave it to the language model to figure out what to pay attention to and estimate what should be the next word.
>>2220 Sorry I never got back to thanking you for this detailed response Anon. At first I wanted to wait until I had studied everything you mentioned in depth so I would have a cogent response without being embarrassing. Then I plainly forgot about the post among the other distractions here and IRL. Obviously this was rude of me, and even though I still don't have a cogent response ready, at the least I'd like to thank you since I just rediscovered my oversight. Cheers.
>>2220 >>4084 Well I guess it can be screencapped at least for posterity purpose, when other anons are coming in and asking a similar question.
>>4106 yes, good thinking. we'll be making a general glossary type thread as well, so we can add this to it.

Report/Delete/Moderation Forms

Captcha (required for reports and bans by board staff)

no cookies?